Fizz buzz

Most developers know Fizz buzz. It’s a very common coding test problem in recruiting, where you go through the numbers and if a number is divisible by 3, you print “Fizz”, if it’s divisible by 5, you print “Buzz”, if it’s divisible by both, you print “FizzBuzz”, and otherwise you print the number itself.

A typical output looks like this (from 1 to 15):

1, 2, Fizz, 4, Buzz, Fizz, 7, 8, Fizz, Buzz, 11, Fizz, 13, 14, FizzBuzz
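
For reference, here’s a minimal plain-Python version of the game, just to have something to compare against:

for i in range(1, 16):
    if i % 3 == 0 and i % 5 == 0:
        print('FizzBuzz')
    elif i % 3 == 0:
        print('Fizz')
    elif i % 5 == 0:
        print('Buzz')
    else:
        print(i)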

Implementing Fizz buzz in TensorFlow

You can split AI development into 3 steps: gathering the data, creating the model and training it. The key to a great neural network is the data, so let’s start by gathering some training data!

Generating the data

Since a neural network can only take in and spit out floating point numbers between -1 and 1 (or 0 and 1), we need to improvise a bit to get the numbers into the network. Let’s feed the input in as binary, using 8 input neurons, one per bit. That is enough to let us train the network on numbers from 0 to 2^8 - 1 (0-255), which is plenty for this example.

def encode_input(num):
    # Encode a number as a list of 8 bits, e.g. 3 -> [0, 0, 0, 0, 0, 0, 1, 1]
    return [int(bit) for bit in bin(num)[2:].zfill(8)]
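
A quick check of what this produces:

print(encode_input(10))  # [0, 0, 0, 0, 1, 0, 1, 0] - 10 is 1010 in binary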

For the output, we have 4 different options: FizzBuzz, Fizz, Buzz, or the number itself. We can use 4 binary outputs for this, one per option.

def encode_fizzbuzz(num):
    if num % 3 == 0 and num % 5 == 0:
        # FizzBuzz
        return [1, 0, 0, 0]
    elif num % 3 == 0:
        # Fizz
        return [0, 1, 0, 0]
    elif num % 5 == 0:
        # Buzz
        return [0, 0, 1, 0]
    else:
        # Number
        return [0, 0, 0, 1]
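
And the same kind of quick check for the outputs:

print(encode_fizzbuzz(15))  # [1, 0, 0, 0] - FizzBuzz
print(encode_fizzbuzz(7))   # [0, 0, 0, 1] - just the number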

Since we already have encode_input() and encode_fizzbuzz() functions implemented, we can just use a simple loop to generate the data.

x = []
y = []
for i in range(101):
    x.append(encode_input(i))
    y.append(encode_fizzbuzz(i))

Now x contains the input data and y the expected output.
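
For example, index 3 now holds the bits for the number 3 and its expected answer:

print(x[3])  # [0, 0, 0, 0, 0, 0, 1, 1]
print(y[3])  # [0, 1, 0, 0] - Fizz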

Creating the model

As usual, the input shape is set to match what we use as an input, so input_dim=8 matches the 8-bit binary input we’re giving the network. Since this is a classification problem, the last layer’s activation function should be softmax. Softmax scales the outputs so they sum to 1; for example the raw output [1, 2, 3, 4] would become [0.0320586, 0.08714432, 0.23688282, 0.64391426]. The last layer’s size is also set to 4, because we have 4 possible outputs.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, input_dim=8, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(4, activation='softmax')
])
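
As an aside, you can verify the softmax example from above with a few lines of plain Python:

import math

def softmax(values):
    # softmax(v_i) = exp(v_i) / sum of exp(v_j) over all j
    exps = [math.exp(v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax([1, 2, 3, 4]))
# [0.0320586..., 0.0871443..., 0.2368828..., 0.6439142...]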

Now we just compile the model. We’re using categorical_crossentropy as the loss function, since this is a classification network with one-hot targets.

model.compile(tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy')
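
If you’re curious what that loss actually computes: for a single sample it’s -sum(y_true * log(y_pred)). A tiny illustration with made-up prediction values:

import math

y_true = [0, 1, 0, 0]          # the correct answer is Fizz
y_pred = [0.1, 0.7, 0.1, 0.1]  # hypothetical softmax output from the network
loss = -sum(t * math.log(p) for t, p in zip(y_true, y_pred))
print(loss)  # ~0.357, i.e. -log(0.7)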

Training the model

Next we simply call the model.fit() method to train the network. Setting verbose=2 limits the output to a single line per epoch, and shuffle=True shuffles the training data, which gives slightly better results. 1000 epochs seemed to work pretty well for a 100-number Fizz buzz.

model.fit(x=x, y=y,
          verbose=2, shuffle=True, epochs=1000)
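
One note: x and y are plain Python lists here. Keras usually converts them on the fly, but if your TensorFlow version complains about the input type, converting them to numpy arrays first should help:

import numpy as np

x = np.array(x)
y = np.array(y)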

Output

Since the binary output is pretty boring and not really human readable, we still need to parse the output from the network. We could use numpy.argmax() here, but I like to pull in as few external libs as possible.

def decode_fizzbuzz(result, num):
    # Pick the index with the highest activation (a plain-Python argmax)
    index = max(range(len(result)), key=lambda i: result[i])
    return ['FizzBuzz', 'Fizz', 'Buzz', num][index]
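
For example, with a made-up prediction where the second output wins:

print(decode_fizzbuzz([0.1, 0.7, 0.1, 0.1], 7))  # Fizz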

Now we can try our model. Since the training data covered the numbers up to 100, let’s use the range 1-100 to test it with!

correct = 0
wrong = 0
for i in range(1, 101):
    result = model.predict([encode_input(i)])
    output = decode_fizzbuzz(result[0], i)
    print(output, end=' ')
    if output == decode_fizzbuzz(encode_fizzbuzz(i), i):
        correct += 1
    else:
        wrong += 1

print('')
print('Total correct:', correct)
print('Wrong:', wrong)
print('Correct percentage:', correct / (correct + wrong) * 100, '%')
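
A side note on speed: calling model.predict() separately for every number has quite a bit of per-call overhead. If the loop feels slow, one option is to predict the whole range in a single batch instead, something like:

batch = [encode_input(i) for i in range(1, 101)]
results = model.predict(batch)  # one call, 100 predictions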

Now, finally, we can run the script and see if it works!

$ python fizzbuzz.py
2020-03-08 21:00:08.275778: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-03-08 21:00:10.162458: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll

# ...snip...

Train on 100 samples
Epoch 1/1000
100/100 - 0s - loss: 1.3081
Epoch 2/1000
100/100 - 0s - loss: 1.2411
Epoch 3/1000
100/100 - 0s - loss: 1.2023
Epoch 4/1000
100/100 - 0s - loss: 1.1741
Epoch 5/1000

# ...snip...

Epoch 1000/1000
100/100 - 0s - loss: 5.6846e-04

1 2 Fizz 4 Buzz Fizz 7 8 Fizz Buzz 11 Fizz 13 14 FizzBuzz 16 17 Fizz 19 Buzz Fizz 22 23 Fizz Buzz 26 Fizz 28 29 FizzBuzz 31 32 Fizz 34 Buzz Fizz 37 38 Fizz Buzz 41 Fizz 43 44 FizzBuzz 46 47 Fizz 49 Buzz Fizz 52 53 Fizz Buzz 56 Fizz 58 59 FizzBuzz 61 62 Fizz 64 Buzz Fizz 67 68 Fizz Buzz 71 Fizz 73 74 FizzBuzz 76 77 Fizz 79 Buzz Fizz 82 83 Fizz Buzz 86 Fizz 88 89 FizzBuzz 91 92 Fizz 94 Buzz Fizz 97 98 Fizz 100
Total correct: 99
Wrong: 1
Correct percentage: 99.0 %

Wow, only 1 wrong answer (it was for the number 100)! This is pretty promising.

Conclusion

Yes, it will AI! This is a super inefficient way to play Fizz buzz, but it gets really accurate after 1000 epochs.

The true power of neural networks is that they also work with data they have never seen before. You might be wondering how this model does beyond the numbers it was trained on, so let’s try it out with the numbers 100-150:
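
The code for this isn’t shown here, but it’s just the same evaluation loop with a different range, something like:

for i in range(100, 151):
    result = model.predict([encode_input(i)])
    print(decode_fizzbuzz(result[0], i), end=' ')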

100 101 Fizz 103 104 105 106 107 Fizz 109 110 111 112 113 Fizz Fizz 116 Fizz Fizz Fizz Fizz 121 122 123 124 125 Fizz 127 FizzBuzz 129 130 Fizz 132 133 Fizz 135 136 Fizz Buzz 139 Fizz 141 142 FizzBuzz 144 145 Fizz 147 Fizz Fizz 150
Total correct: 22
Wrong: 29
Correct percentage: 43.13725490196079 %

Well, it doesn’t work nearly as well, but it still beats picking the output at random, and in my book, that’s a win! You can download the whole code from here and run it with python fizzbuzz.py.