Fizz buzz
Most developers know Fizz buzz. It's a very common recruiter coding-test problem: you go through the numbers and print "Fizz" if a number is divisible by 3, "Buzz" if it's divisible by 5, "FizzBuzz" if it's divisible by both, and the number itself otherwise.
A typical output looks like this (from 1 to 15):
```
1, 2, Fizz, 4, Buzz, Fizz, 7, 8, Fizz, Buzz, 11, Fizz, 13, 14, FizzBuzz
```
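For reference, the classic non-AI version is just a loop and a couple of modulo checks:

```python
for i in range(1, 16):
    if i % 15 == 0:      # divisible by both 3 and 5
        print('FizzBuzz')
    elif i % 5 == 0:
        print('Buzz')
    elif i % 3 == 0:
        print('Fizz')
    else:
        print(i)
```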
Implementing Fizz buzz in TensorFlow
You can split AI development into three steps: gathering data, creating the model, and training it. The key to a great neural network is the data, so let's start by gathering some training data!
Generating the data
Since neural networks work best with floating point numbers between -1 and 1 (or 0 and 1), we need to improvise a bit to get the data into a form the network can use. Let's feed the numbers in as binary, using 8 input neurons. That's enough to represent numbers from 0 to 2^8 - 1 (0-255), which is plenty for this example. An encoder for that can be as simple as this:
```python
def encode_input(num):
    # Encode the number as 8 binary digits, least significant bit first
    # (the bit order doesn't matter, as long as it's consistent)
    return [num >> i & 1 for i in range(8)]
```
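For example, 6 is 110 in binary, so with the least significant bit first it encodes like this:

```python
>>> encode_input(6)
[0, 1, 1, 0, 0, 0, 0, 0]
```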
For the output, we have 4 different options: the number itself, Fizz, Buzz, and FizzBuzz. We can use 4 binary outputs for this, one-hot encoded in that order.
```python
def encode_fizzbuzz(num):
    # One-hot encode the expected answer: [number, Fizz, Buzz, FizzBuzz]
    if num % 15 == 0:
        return [0, 0, 0, 1]
    elif num % 5 == 0:
        return [0, 0, 1, 0]
    elif num % 3 == 0:
        return [0, 1, 0, 0]
    return [1, 0, 0, 0]
```
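So 15 lights up the FizzBuzz output and 7 lights up the plain-number output:

```python
>>> encode_fizzbuzz(15)
[0, 0, 0, 1]
>>> encode_fizzbuzz(7)
[1, 0, 0, 0]
```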
Since we already have the encode_input() and encode_fizzbuzz() functions implemented, we can just use a simple loop to generate the data.
```python
x = []
y = []

# Training data: the numbers 1-100
for i in range(1, 101):
    x.append(encode_input(i))
    y.append(encode_fizzbuzz(i))
```
Now x contains the input data and y the expected output.
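A quick sanity check: the third sample is the number 3, which should map to the Fizz output.

```python
>>> x[2], y[2]
([1, 1, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0])
```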
Creating the model
As usual, the input shape is set to match the input, so input_dim=8 corresponds to the 8-bit binary numbers we're giving the network. Since this is a classification problem, the last layer's activation function should be softmax. Softmax scales the outputs so they sum to 1; for example the output [1, 2, 3, 4] would become [0.0320586, 0.08714432, 0.23688282, 0.64391426]. The last layer size is also set to 4, because we have 4 possible outputs.
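You can verify those numbers by hand with plain Python; softmax is just exponentiation followed by normalization:

```python
import math

def softmax(values):
    # Exponentiate each value, then normalize so the results sum to 1
    exps = [math.exp(v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax([1, 2, 3, 4]))
# prints roughly [0.0320586, 0.08714432, 0.23688282, 0.64391426]
```

With the output layer sorted, the whole model can look something like this (the hidden layer size and activation are a free choice; one relu layer of 64 units is a reasonable pick):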
```python
import tensorflow as tf

model = tf.keras.Sequential([
    # Hidden layer; the size here is a free choice
    tf.keras.layers.Dense(64, input_dim=8, activation='relu'),
    # Output layer: one neuron per possible answer
    tf.keras.layers.Dense(4, activation='softmax')
])
```
Now we just compile the model. We're using categorical_crossentropy as the loss function for this classification network.
```python
model.compile(tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy')
```
Training the model
Next we simply call the model.fit() method to train the network. Setting verbose=2 limits the output to a single line per epoch, and shuffle=True shuffles the training data for slightly better results. 1000 epochs seemed to work pretty well for a 100-number Fizz buzz.
```python
model.fit(x=x, y=y,
          epochs=1000,
          shuffle=True,
          verbose=2)
```
Output
Since binary output is pretty boring and not really human readable, we now need to parse the network's output. We could use numpy.argmax() here, but I like to add as few external libs as possible.
```python
def decode_fizzbuzz(result, num):
    # Manual argmax: pick the option with the highest output value
    options = [str(num), 'Fizz', 'Buzz', 'FizzBuzz']
    best = max(range(len(result)), key=lambda i: result[i])
    return options[best]
```
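For example, if the Fizz output wins for the number 3, we get the string back:

```python
>>> decode_fizzbuzz([0.01, 0.9, 0.05, 0.04], 3)
'Fizz'
```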
Now we can try our model. Since the training was done with numbers 1-100, let's use the same range to test it!
```python
correct = 0

for i in range(1, 101):
    # Predict on a single encoded number and decode the answer
    prediction = model.predict([encode_input(i)], verbose=0)[0]
    answer = decode_fizzbuzz(prediction, i)
    if answer == decode_fizzbuzz(encode_fizzbuzz(i), i):
        correct += 1
    print(answer)

print('Correct: %d/100' % correct)
```
Now, finally, we can run the script and see if it works!
```
$ python fizzbuzz.py
```
Wow, only 1 wrong answer (it was for the number 100)! This is pretty promising.
Conclusion
Yes, it will AI! It's super inefficient, but it gets really accurate after 1000 epochs.
The true power of neural networks is that they also work with data they have never seen before. You might be curious how this one does beyond 1-100, so let's try it out with numbers 100-150:
```
100 101 Fizz 103 104 105 106 107 Fizz 109 110 111 112 113 Fizz Fizz 116 Fizz Fizz Fizz Fizz 121 122 123 124 125 Fizz 127 FizzBuzz 129 130 Fizz 132 133 Fizz 135 136 Fizz Buzz 139 Fizz 141 142 FizzBuzz 144 145 Fizz 147 Fizz Fizz 150
```
Well, it doesn't work as well, but it still beats picking the output at random, and in my book, that's a win! You can download the whole code from here and run it with python fizzbuzz.py.