Zerotogans series 3: a simplified foolproof approach

Cassie Guo
3 min readJun 9, 2020

Following up on the previous post, this time I want to build a deep learning model on a more complex form of data: images. To teach anyone to code a deep neural network, I will demonstrate it by answering this question (see this thread):

What are the three steps to put an elephant into the fridge?

The task at hand is to predict which of several classes the object in an image belongs to. See below for an example from the CIFAR10 dataset (the classes: 'airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck').

Step 1: Build a fridge (which can train models)

For any deep learning/machine learning task, the best practice is to use object-oriented programming: having a framework with essential procedures and variables that you can tweak liberates you from writing the same code multiple times.

To make the code easy to use, we will try to build a framework and experiment with different hyperparameters.

Just like I mentioned in the previous post, we will need:

  • CIFAR10Model() class;
  • fit() method;
  • validate() method;

The only difference is we will need multiple layers to construct a deep neural net.

Here in the constructor, we added another linear function after the first one. Meanwhile, in the forward() function, we added an activation function. Why do we need the activation function? It introduces non-linearity, which helps the network learn more complex relationships between input and target. The function we used here is leaky ReLU. It looks like this:

Left: ReLU. Right: leaky ReLU.

You might also have noticed the ImageClassificationBase in the code; the model extends this base class, shown below:
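The base class itself isn't embedded here either; a common way to write it (a sketch, with method names like `training_step` being my assumptions rather than the exact gist) is:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ImageClassificationBase(nn.Module):
    """Shared training/validation logic; subclasses only define forward()."""
    def training_step(self, batch):
        images, labels = batch
        out = self(images)                      # forward pass
        return F.cross_entropy(out, labels)     # loss for the optimizer

    def validation_step(self, batch):
        images, labels = batch
        out = self(images)
        loss = F.cross_entropy(out, labels)
        preds = torch.argmax(out, dim=1)
        acc = (preds == labels).float().mean()  # batch accuracy
        return {'val_loss': loss.detach(), 'val_acc': acc}

    def validation_epoch_end(self, outputs):
        # Average the per-batch metrics over the whole validation set.
        losses = torch.stack([x['val_loss'] for x in outputs])
        accs = torch.stack([x['val_acc'] for x in outputs])
        return {'val_loss': losses.mean().item(), 'val_acc': accs.mean().item()}
```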

This design helps us clearly separate the modifiable part (the model architecture) from the unmodifiable part (the fit and validation methods).

Last but not least, we will have a series of functions to transfer the data to the GPU so that we can supercharge the training process!

Step 2: Put the elephant (images) in the fridge (model)

To put the elephant (images) into the fridge, first we will have to do some cleaning and loading.

Now we just need to train the model by instantiating the model class and running the fit method.

Step 3: Close the fridge

Because we have a well-designed pattern, we only need to change a few things to reuse the code and run a series of experiments to test:

  • What is the best hidden size?
  • What is the best activation function?
  • What is the best optimizer?
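To illustrate how the pattern makes these experiments cheap, here is a self-contained toy version of the experiment loop that runs on synthetic data; in the real experiments the CIFAR10 loaders and model from the earlier steps are used instead, and everything here (names, sizes, one epoch of training) is a sketch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

# Tiny synthetic stand-in for CIFAR10 so the loop runs in seconds.
images = torch.randn(64, 3, 32, 32)
labels = torch.randint(0, 10, (64,))
loader = DataLoader(TensorDataset(images, labels), batch_size=16)

def make_model(hidden_size, activation):
    """Build the two-linear-layer model with a configurable hidden size/activation."""
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(3 * 32 * 32, hidden_size),
        activation,
        nn.Linear(hidden_size, 10),
    )

def train(model, opt_func, epochs=1, lr=0.01):
    """One short training run; returns the final batch loss."""
    optimizer = opt_func(model.parameters(), lr=lr)
    for _ in range(epochs):
        for xb, yb in loader:
            loss = F.cross_entropy(model(xb), yb)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return loss.item()

# Vary one knob at a time:
for hidden_size in [16, 32, 64]:
    train(make_model(hidden_size, nn.LeakyReLU()), torch.optim.SGD)
for activation in [nn.ReLU(), nn.LeakyReLU(), nn.Tanh()]:
    train(make_model(32, activation), torch.optim.SGD)
for opt_func in [torch.optim.SGD, torch.optim.Adam, torch.optim.Adamax]:
    train(make_model(32, nn.LeakyReLU()), opt_func)
```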

Hidden size: we tried quite a few different hidden sizes. The results suggest that larger hidden sizes often lead to better performance, since more hidden units help retain information from the previous layer.

Comparison of different hidden sizes (ReLU activation, SGD optimizer)

Activation function: we tried ReLU, leaky ReLU and tanh. It seems that leaky ReLU has the best performance given the same learning rate and number of epochs.

Optimizer: we tried SGD, Adam and Adamax; the training plot is below. It shows that SGD is relatively stable, whereas Adam and Adamax go through a period of oscillation.

There you have it! In three steps, we built a simple three-layered neural network to train a CIFAR10 image classifier. In the next post, we will keep improving this classifier by adding convolutional layers. Stay tuned!

