# Fizz buzz

Most developers know Fizz buzz. It’s a very common coding-interview problem: you go through the numbers, and if a number is divisible by 3 you print “Fizz”, if it’s divisible by 5 you print “Buzz”, if it’s divisible by both you print “FizzBuzz”, and otherwise you print the number itself.

A typical output looks like this (from 1 to 15):

```
1, 2, Fizz, 4, Buzz, Fizz, 7, 8, Fizz, Buzz, 11, Fizz, 13, 14, FizzBuzz
```
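For comparison, the plain (non-AI) version fits in a few lines of Python:

```python
def fizzbuzz(num):
    # Classic rules: multiples of 3 -> Fizz, of 5 -> Buzz, of both -> FizzBuzz
    if num % 15 == 0:
        return "FizzBuzz"
    if num % 3 == 0:
        return "Fizz"
    if num % 5 == 0:
        return "Buzz"
    return str(num)

# Prints the 1-15 sequence shown above
print(", ".join(fizzbuzz(n) for n in range(1, 16)))
```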

## Implementing Fizz buzz in TensorFlow

You can split AI development into three steps: gathering the data, creating the model, and training it. The key to a great neural network is the data, so let’s start by gathering some training data!

### Generating the data

Since a neural network works best with floating point inputs and outputs in a small range (roughly −1 to 1, or 0 to 1), we need to improvise a bit to feed it our numbers. Let’s input the numbers in binary, using 8 input neurons, one per bit. That is enough to represent the numbers from 0 to 2^8 − 1 (0-255), which is plenty for this example.

```python
def encode_input(num):
```
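Spelled out, the encoder might look like this (a sketch; the least-significant-bit-first bit order is an assumption, any fixed order works as long as it’s consistent):

```python
def encode_input(num):
    # Turn num into a list of 8 bits, least significant bit first,
    # e.g. 6 -> [0, 1, 1, 0, 0, 0, 0, 0]
    return [num >> i & 1 for i in range(8)]
```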

For the output, we have 4 different options: `Fizz`, `Buzz`, `FizzBuzz`, and the number itself. We can use 4 binary outputs for this, one per option.

```python
def encode_fizzbuzz(num):
```
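A matching encoder might be the following (a sketch; the class order `[number, Fizz, Buzz, FizzBuzz]` is an assumption, and the decoder later must use the same order):

```python
def encode_fizzbuzz(num):
    # One-hot vector over the four classes: [the number itself, Fizz, Buzz, FizzBuzz]
    if num % 15 == 0:
        return [0, 0, 0, 1]  # FizzBuzz
    if num % 3 == 0:
        return [0, 1, 0, 0]  # Fizz
    if num % 5 == 0:
        return [0, 0, 1, 0]  # Buzz
    return [1, 0, 0, 0]      # plain number
```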

Since we already have the `encode_input()` and `encode_fizzbuzz()` functions implemented, we can just use a simple loop to generate the data.

```python
x = []
y = []
for i in range(1, 101):  # train on the numbers 1-100
    x.append(encode_input(i))
    y.append(encode_fizzbuzz(i))
```

Now `x` contains the input data and `y` the expected output.

### Creating the model

As usual, the input shape is set to match what we feed the network, so `input_dim=8` corresponds to the 8-bit binary input we’re giving it. Since this is a classification problem, the last layer’s activation function should be `softmax`. It normalizes the outputs so they sum to 1; for example, the raw output `[1, 2, 3, 4]` would become `[0.0320586, 0.08714432, 0.23688282, 0.64391426]`. The last layer size is also set to 4, because we have 4 possible outputs.
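You can sanity-check those softmax numbers with a few lines of plain Python (just an illustration, not part of the model):

```python
import math

def softmax(values):
    # Exponentiate each value, then normalize so the outputs sum to 1
    exps = [math.exp(v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax([1, 2, 3, 4]))
# -> approximately [0.0321, 0.0871, 0.2369, 0.6439]; the largest input gets the largest share
```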

```python
import tensorflow as tf

model = tf.keras.Sequential([
    # The hidden layer size (64) is a guess; the exact original layout may differ
    tf.keras.layers.Dense(64, input_dim=8, activation='relu'),
    tf.keras.layers.Dense(4, activation='softmax'),  # 4 outputs, one per class
])
```

Now we just compile the model. We’re using `categorical_crossentropy` as the loss function for this classification network.

```python
model.compile(tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy')
```

### Training the model

Next we simply call the `model.fit()` method to train the network. Setting `verbose=2` limits the output to a single line per epoch, and `shuffle=True` shuffles the training data for slightly better results. 1000 epochs seemed to work pretty well for a 100-number Fizz buzz.

```python
model.fit(x=x, y=y, epochs=1000, shuffle=True, verbose=2)
```

### Output

Since the binary output is pretty boring and not really human-readable, we now need to parse the output from the network. We could use `numpy.argmax()` here, but I like to add as few external libraries as possible.

```python
def decode_fizzbuzz(result, num):
```
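Spelled out, the decoder can do the argmax by hand (a sketch; the class order must match whatever `encode_fizzbuzz()` uses, here assumed to be `[number, Fizz, Buzz, FizzBuzz]`):

```python
def decode_fizzbuzz(result, num):
    # Hand-rolled argmax: pick the class with the highest score
    options = [str(num), "Fizz", "Buzz", "FizzBuzz"]
    best = max(range(len(result)), key=lambda i: result[i])
    return options[best]
```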

Now we can try our model. Since the training was done for numbers 1-100, let’s just use the same range to test it with!

```python
correct = 0
for i in range(1, 101):
    result = model.predict([encode_input(i)], verbose=0)[0]
    prediction = decode_fizzbuzz(list(result), i)
    # Compare against the true answer for this number
    if prediction == decode_fizzbuzz(encode_fizzbuzz(i), i):
        correct += 1
    print(prediction)
print(correct, '/ 100 correct')
```

Now, finally, we can run the script and see if it works!

```
$ python fizzbuzz.py
```

Wow, only 1 wrong answer (it was for the number `100`)! This is pretty promising.

## Conclusion

So yes, it can be done with AI! It’s super inefficient, but it gets really accurate after 1000 epochs.

The true power of neural networks is that they also work on data they have never seen before. You might be curious how this model does beyond 1-100, so let’s try it out with the numbers 100-150:

```
100 101 Fizz 103 104 105 106 107 Fizz 109 110 111 112 113 Fizz Fizz 116 Fizz Fizz Fizz Fizz 121 122 123 124 125 Fizz 127 FizzBuzz 129 130 Fizz 132 133 Fizz 135 136 Fizz Buzz 139 Fizz 141 142 FizzBuzz 144 145 Fizz 147 Fizz Fizz 150
```

Well, it doesn’t work *as* well, but it still beats picking the output at random, and in my books that’s a win! You can download the whole code from here and run it with `python fizzbuzz.py`.