Most developers know Fizz buzz. It’s a very common coding interview problem: you go through the numbers, and if a number is divisible by 3 you print “Fizz”, if it’s divisible by 5 you print “Buzz”, and if it’s divisible by both you print “FizzBuzz”. Otherwise you print the number itself.

A typical output looks like this (from 1 to 15):
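As a reference point, here is a minimal plain-Python sketch of the rules above. The function name `fizzbuzz` is only an illustration and is not taken from the original script.

```python
def fizzbuzz(n):
    """Return the Fizz buzz answer for n as a string."""
    if n % 15 == 0:  # divisible by both 3 and 5
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


# Prints, one per line:
# 1 2 Fizz 4 Buzz Fizz 7 8 Fizz Buzz 11 Fizz 13 14 FizzBuzz
for i in range(1, 16):
    print(fizzbuzz(i))
```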

## Implementing Fizz buzz in TensorFlow

You can split AI development into 3 steps: gathering data, creating the model and training it. Good data is the key to a great neural network, so let’s start by gathering some training data!

### Generating the data

Since a neural network takes floating point numbers as input and output, and works best when they stay in a small range such as -1 to 1 (or 0 to 1), we need to improvise a bit to get the data into the network. Let’s input the numbers in binary, using 8 input neurons, one per bit. That is enough to cover the numbers from 0 to 2^8 - 1 (0-255), which is plenty for this example.
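One way the binary input encoding could look is sketched below. The bit order (most significant bit first) and the `bits` parameter are assumptions made for illustration; the original function may differ.

```python
def encode_input(number, bits=8):
    """Encode an integer as a list of 0.0/1.0 floats, one per bit.

    Assumption: most significant bit first, 8 bits by default.
    """
    if not 0 <= number < 2 ** bits:
        raise ValueError(f"{number} does not fit in {bits} bits")
    return [float((number >> i) & 1) for i in reversed(range(bits))]


print(encode_input(5))  # [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0]
```

With 8 bits the largest number that fits is 255, so anything above that raises an error instead of being silently truncated.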

For the output, we have 4 different options: `Fizz`, `Buzz`, `FizzBuzz` and the number itself. We can use 4 binary outputs for this, one per option.
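A possible sketch of the output encoding follows. The order of the four outputs (`[number, Fizz, Buzz, FizzBuzz]`) is an assumption; any fixed order works as long as the decoding step uses the same one.

```python
def encode_fizzbuzz(number):
    """One-hot encode the expected answer for a number.

    Assumed output order: [number itself, Fizz, Buzz, FizzBuzz].
    """
    if number % 15 == 0:
        return [0.0, 0.0, 0.0, 1.0]  # FizzBuzz
    if number % 5 == 0:
        return [0.0, 0.0, 1.0, 0.0]  # Buzz
    if number % 3 == 0:
        return [0.0, 1.0, 0.0, 0.0]  # Fizz
    return [1.0, 0.0, 0.0, 0.0]      # the number itself


print(encode_fizzbuzz(9))   # [0.0, 1.0, 0.0, 0.0]
print(encode_fizzbuzz(30))  # [0.0, 0.0, 0.0, 1.0]
```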

Since we already have `encode_input()` and `encode_fizzbuzz()` functions implemented, we can just use a simple loop to generate the data.
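The loop might look roughly like this. The two helpers are repeated in compact form so the snippet runs on its own; their bit order and output order are assumptions, and the 1-100 range is the training range mentioned later in this post.

```python
def encode_input(number, bits=8):
    # Assumed encoding: one float per bit, most significant bit first.
    return [float((number >> i) & 1) for i in reversed(range(bits))]


def encode_fizzbuzz(number):
    # Assumed one-hot order: [number itself, Fizz, Buzz, FizzBuzz].
    if number % 15 == 0:
        return [0.0, 0.0, 0.0, 1.0]
    if number % 5 == 0:
        return [0.0, 0.0, 1.0, 0.0]
    if number % 3 == 0:
        return [0.0, 1.0, 0.0, 0.0]
    return [1.0, 0.0, 0.0, 0.0]


x = []  # network inputs, one 8-element row per number
y = []  # expected outputs, one 4-element row per number
for i in range(1, 101):
    x.append(encode_input(i))
    y.append(encode_fizzbuzz(i))

print(len(x), len(y))  # 100 100
```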

Now `x` contains the input data and `y` the expected output.

### Creating the model

As usual, the input shape matches what we use as an input, so `input_dim=8` represents the 8-bit binary input we’re giving the network. Since this is a classification problem, the last layer’s activation function should be `softmax`. It normalizes the outputs into positive values that sum to 1; for example, the output `[1, 2, 3, 4]` would become `[0.0320586, 0.08714432, 0.23688282, 0.64391426]`. The last layer’s size is set to 4, because we have 4 possible outputs.
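To see where those softmax numbers come from, here is a small standalone sketch in plain Python. It is only an illustration of the formula, not the code Keras runs internally.

```python
import math


def softmax(values):
    """Exponentiate each value and divide by the total, so the results sum to 1."""
    peak = max(values)  # subtracting the maximum avoids overflow for large inputs
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1, 2, 3, 4])
print([round(p, 4) for p in probs])  # [0.0321, 0.0871, 0.2369, 0.6439]
```

The largest input always gets the largest share, which is why picking the highest output later gives the predicted class.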

Now we just compile the model. We’re using `categorical_crossentropy` as the loss function, which fits a classification network with one-hot encoded targets.
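Put together, the model and the compile step could be sketched like this. The hidden layer size (64), its `relu` activation and the `adam` optimizer are assumptions chosen for illustration, and `keras.Input(shape=(8,))` is used here in place of `input_dim=8` on the first layer, which describes the same 8-value input.

```python
import tensorflow as tf


def build_model():
    """Build and compile a small classifier: 8 binary inputs, 4 softmax outputs."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(8,)),                      # the 8-bit binary input
        tf.keras.layers.Dense(64, activation="relu"),    # hidden layer (assumed size)
        tf.keras.layers.Dense(4, activation="softmax"),  # one output per class
    ])
    model.compile(
        loss="categorical_crossentropy",
        optimizer="adam",      # assumption: the original optimizer is not known
        metrics=["accuracy"],
    )
    return model


model = build_model()
model.summary()
```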

### Training the model

Next we simply call the `model.fit()` method to train the network. Setting `verbose=2` limits the output to a single line per epoch, and `shuffle=True` shuffles the training data to get slightly better results. 1000 epochs seemed to work pretty well for Fizz buzz with 100 numbers.
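The training call might be wrapped like this. The helper name `train` is hypothetical, `model` is assumed to be an already compiled Keras model, and `x` and `y` are assumed to be lists of rows that get converted to NumPy arrays first.

```python
import numpy as np


def train(model, x, y, epochs=1000):
    """Fit a compiled Keras model on the encoded Fizz buzz data."""
    return model.fit(
        np.array(x, dtype="float32"),  # shape: (samples, 8)
        np.array(y, dtype="float32"),  # shape: (samples, 4)
        epochs=epochs,
        shuffle=True,  # reshuffle the training data before each epoch
        verbose=2,     # one log line per epoch
    )
```

Keeping `epochs` as a parameter makes it easy to do a quick trial run with a small value before committing to the full 1000.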

### Output

Since binary output is pretty boring and not really human readable, we now need to parse the output from the network. We could use `numpy.argmax()` here too, but I like to add as few external libraries as possible.
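A library-free way to do this parsing is sketched below. The function name `decode_output` and the output order (`[number, Fizz, Buzz, FizzBuzz]`) are assumptions; the order has to match whatever the training data used.

```python
def decode_output(number, output):
    """Turn the network's 4 output values into a human-readable answer.

    Assumed output order: [number itself, Fizz, Buzz, FizzBuzz].
    """
    # Plain-Python argmax: index of the largest output value.
    best = max(range(len(output)), key=lambda i: output[i])
    if best == 0:
        return str(number)
    return ("Fizz", "Buzz", "FizzBuzz")[best - 1]


print(decode_output(7, [0.90, 0.05, 0.03, 0.02]))   # 7
print(decode_output(15, [0.01, 0.04, 0.05, 0.90]))  # FizzBuzz
```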

Now we can try our model. Since the training was done for numbers 1-100, let’s just use the same range to test it with!
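A test loop along these lines could compare the network's answers against the real rules. Here `predict` is a hypothetical callable that takes a number and returns the decoded answer as a string; in the real script it would wrap the model. The stand-in predictor below is a dummy that always answers with the number itself, just to show the mechanics.

```python
def expected(n):
    """The correct Fizz buzz answer for n."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


def wrong_answers(predict, start=1, end=100):
    """Return the numbers in [start, end] where predict disagrees with the rules."""
    return [n for n in range(start, end + 1) if predict(n) != expected(n)]


# Dummy predictor for illustration only: it never says Fizz, Buzz or FizzBuzz.
mistakes = wrong_answers(lambda n: str(n))
print(f"{len(mistakes)} wrong answers out of 100")  # 47 wrong answers out of 100
```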

Now, finally, we can run the script and see if it works!

Wow, only 1 wrong answer (it was for the number `100`)! This is pretty promising.

## Conclusion

Yes, it will AI! This is super inefficient, but gets really accurate after 1000 epochs.

The true power of neural networks is that they also work with data they have never seen before. You might be interested in how this works beyond 1-100, so let’s try it out with the numbers 100-150:

Well, it doesn’t work as well, but it still beats picking the output at random, and in my book that’s a win! You can download the whole code from here and run it with `python fizzbuzz.py`.