Applying a Deep Neural Network to the Animal Faces Dataset

kshem · Jun 29, 2020
animal faces

‘Teaching machines to classify animals using their images’

Introduction

I have spent the last month solidifying my knowledge of the PyTorch framework with freeCodeCamp and Jovian.ml through their course “Deep Learning with PyTorch: Zero to GANs”. The final step in the course was an independent project covering all parts of deep learning, of course. I chose the Animal Faces dataset (animal classification).

Data Characteristics

This dataset, also known as Animal Faces-HQ (AFHQ), consists of 16,130 high-quality images at 512×512 resolution.
There are three classes, one per animal domain, each providing about 5,000 images. With multiple domains and diverse images of various breeds per domain, AFHQ was designed as a challenging image-to-image translation benchmark; here we use it for classification. The classes are:

  • Cat
  • Dog
  • Wildlife
sample image from the dataset

For this project, we will use 10% of the dataset as the validation set and 90% as the training set. The loss function will be cross-entropy loss, since this is a classification problem. The optimizer will be stochastic gradient descent, and the batch size for gradient descent will be 20. Stochastic gradient descent is an approximation of gradient descent: the gradient of the loss function is computed on a small batch of training points instead of the whole set, which is much faster to compute. This stochastic sampling of training batches introduces a lot of noise, which is actually helpful in preventing the algorithm from getting stuck in narrow local minima.

Animal Faces dataset

The dataset is extracted to the directory animal-faces. It contains 2 folders (train and val), containing the training set (14,630 images, 3 classes) and the validation set (1,500 images, 3 classes) respectively.

Let’s see the train dataset:
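
In code, that looks something like the following sketch, using torchvision’s ImageFolder (the path ./animal-faces is an assumption based on the extraction step above):

```python
import torchvision.transforms as tt
from torchvision.datasets import ImageFolder

data_dir = './animal-faces'   # assumed extraction path

# ToTensor converts each PIL image into a 3 x 512 x 512
# float tensor with values in [0, 1].
dataset = ImageFolder(data_dir + '/train', transform=tt.ToTensor())
print(len(dataset))       # 14630
print(dataset.classes)    # e.g. ['cat', 'dog', 'wild']
```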

Now, let’s take a look at the validation dataset:
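
The validation folder is loaded the same way:

```python
# Same transform as the training set, different folder.
val_dataset = ImageFolder(data_dir + '/val', transform=tt.ToTensor())
print(len(val_dataset))   # 1500
```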

Test set

A test set is used to compare different models, or different types of modelling approaches, and to report the final accuracy of the model. Since there's no predefined test set, we can set aside a small portion (5,000 images) of the training data to be used as the test set. We'll use the random_split helper method from PyTorch to do this. To ensure that we always create the same test set, we'll also set a seed for the random number generator:
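
A sketch of the split (the seed value 43 is arbitrary; any fixed seed gives a reproducible split):

```python
import torch
from torch.utils.data import random_split

torch.manual_seed(43)  # fixed seed so the same split is created every run

test_size = 5000
train_size = len(dataset) - test_size   # 14630 - 5000 = 9630

train_ds, test_ds = random_split(dataset, [train_size, test_size])
print(len(train_ds), len(test_ds))      # 9630 5000
```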

We can also check the shape and label of a sample image from the dataset:
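
For example:

```python
img, label = dataset[0]
print(img.shape, label)        # torch.Size([3, 512, 512]) 0
print(dataset.classes[label])  # class name for this label
```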

Training and Validation Dataset

Now we wrap the training and validation datasets in data loaders, which load the data in batches. We set the batch size to 20:
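
Something like this:

```python
from torch.utils.data import DataLoader

batch_size = 20

# Shuffle the training data each epoch; evaluation can use larger
# batches since no gradients are stored during validation.
train_loader = DataLoader(train_ds, batch_size, shuffle=True,
                          num_workers=2, pin_memory=True)
val_loader = DataLoader(val_dataset, batch_size * 2,
                        num_workers=2, pin_memory=True)
```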

Preparation for model training

As a reminder, this dataset consists of 16,130 high-quality images at 512×512 resolution, in 3 classes, each providing about 5,000 images. There are 1,500 validation images and 14,630 training images. Since we are working with coloured images, each input consists of numeric values split across the three RGB channels.

Base Model and Training on GPU

First, we create a base class for our neural network, where we define the functions used during the training and validation steps.
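
Following the usual pattern from the Zero to GANs course, the base class looks roughly like this (the metric names and print format are assumptions):

```python
import torch.nn as nn
import torch.nn.functional as F

def accuracy(outputs, labels):
    # Fraction of predictions that match the true labels.
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))

class ImageClassificationBase(nn.Module):
    def training_step(self, batch):
        images, labels = batch
        out = self(images)                   # forward pass
        return F.cross_entropy(out, labels)  # training loss

    def validation_step(self, batch):
        images, labels = batch
        out = self(images)
        loss = F.cross_entropy(out, labels)
        acc = accuracy(out, labels)
        return {'val_loss': loss.detach(), 'val_acc': acc}

    def validation_epoch_end(self, outputs):
        # Average the per-batch metrics over the whole validation set.
        batch_losses = [x['val_loss'] for x in outputs]
        batch_accs = [x['val_acc'] for x in outputs]
        return {'val_loss': torch.stack(batch_losses).mean().item(),
                'val_acc': torch.stack(batch_accs).mean().item()}

    def epoch_end(self, epoch, result):
        print("Epoch [{}], val_loss: {:.4f}, val_acc: {:.4f}".format(
            epoch, result['val_loss'], result['val_acc']))
```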

Then we define the evaluate function, which returns the progress of our model after each epoch, and the fit function, which updates the weights on each epoch:
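
A sketch of both, using stochastic gradient descent as the optimizer, as discussed earlier:

```python
@torch.no_grad()
def evaluate(model, val_loader):
    model.eval()
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)

def fit(epochs, lr, model, train_loader, val_loader,
        opt_func=torch.optim.SGD):
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        # Training phase: one gradient step per batch.
        model.train()
        for batch in train_loader:
            loss = model.training_step(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        # Validation phase: record loss and accuracy for this epoch.
        result = evaluate(model, val_loader)
        model.epoch_end(epoch, result)
        history.append(result)
    return history
```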

Thanks to PyTorch, we can use the GPU for training and evaluating our model. GPUs are much more efficient at the heavy matrix arithmetic involved in updating and calculating weights, especially when the data are images or videos, as in our case. So we will move our data to the GPU when one is available; if you don't have one, you can use your CPU as usual, although training will be slower.

First, we check if a GPU is available:
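
```python
torch.cuda.is_available()  # True if a CUDA GPU is visible to PyTorch

def get_default_device():
    """Pick the GPU if one is available, otherwise the CPU."""
    if torch.cuda.is_available():
        return torch.device('cuda')
    return torch.device('cpu')

device = get_default_device()
device   # device(type='cuda') on a GPU machine
```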

Now let's define the helper functions for moving data to the GPU:
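
These helpers follow the course convention (the names to_device and DeviceDataLoader come from there):

```python
def to_device(data, device):
    """Move tensor(s) to the chosen device."""
    if isinstance(data, (list, tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

class DeviceDataLoader():
    """Wrap a DataLoader to move each batch to a device on the fly."""
    def __init__(self, dl, device):
        self.dl = dl
        self.device = device

    def __iter__(self):
        for b in self.dl:
            yield to_device(b, self.device)

    def __len__(self):
        return len(self.dl)
```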

Next, we wrap our data loaders so that batches are moved to the GPU:
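
```python
train_loader = DeviceDataLoader(train_loader, device)
val_loader = DeviceDataLoader(val_loader, device)
```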

Training the Model

First, we define the input size of our network: each image is 512×512 pixels with 3 colour channels. We also set the output size, which should equal the number of classes:
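
```python
input_size = 3 * 512 * 512   # 786,432 input features per image
output_size = 3              # cat, dog and wildlife
```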

Neural network:

Our neural network is a feed-forward model with an input layer and 3 hidden layers, using ReLU as the activation function:
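
A sketch of the architecture; the hidden-layer widths (256, 128, 64) are illustrative choices, not necessarily the ones in the original notebook:

```python
class AnimalFacesModel(ImageClassificationBase):
    def __init__(self, in_size, out_size):
        super().__init__()
        self.network = nn.Sequential(
            nn.Flatten(),              # 3 x 512 x 512 -> 786432 features
            nn.Linear(in_size, 256),   # hidden layer 1
            nn.ReLU(),
            nn.Linear(256, 128),       # hidden layer 2
            nn.ReLU(),
            nn.Linear(128, 64),        # hidden layer 3
            nn.ReLU(),
            nn.Linear(64, out_size),   # output layer (class scores)
        )

    def forward(self, xb):
        return self.network(xb)
```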

With these functions in place, we can start training. Remember to instantiate the model on the GPU:
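
```python
model = to_device(AnimalFacesModel(input_size, output_size), device)
```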

First, let us see how our model performs before it's trained:
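
```python
history = [evaluate(model, val_loader)]
history   # e.g. [{'val_loss': 1.10, 'val_acc': 0.33}]
```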

So we get an accuracy of 33.3% before the model is trained, which is exactly what random guessing over 3 classes would give. Let's train the model and see if there are any improvements.

We train the model using the fit function to reduce the validation loss and improve accuracy, by setting the number of epochs and the learning rate:
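
For instance, training for 7 epochs and appending the results to the history:

```python
history += fit(7, 0.0001, model, train_loader, val_loader)
```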

A learning rate of 0.0001 seems like a reasonable starting point, so we begin with it.

Excellent! After 7 epochs we’re getting an accuracy of 60%, isn’t that good? Let’s increase the learning rate to 0.001 and see what happens:
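
Continuing training at the higher rate, appending to the same history:

```python
history += fit(7, 0.001, model, train_loader, val_loader)
```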

It looks like increasing the learning rate tenfold provides better results:

If you want, you can continue adjusting this model, try different approaches, and see if you can get better results. Here, we are getting 81%, which is good!

Evaluating the model:

We plot the losses and accuracies and evaluate the model on the test set:
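
A sketch of the plotting helpers and the final test-set evaluation, using matplotlib:

```python
import matplotlib.pyplot as plt

def plot_accuracies(history):
    accuracies = [r['val_acc'] for r in history]
    plt.plot(accuracies, '-x')
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.title('Accuracy vs. No. of epochs')
    plt.show()

def plot_losses(history):
    losses = [r['val_loss'] for r in history]
    plt.plot(losses, '-x')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.title('Loss vs. No. of epochs')
    plt.show()

plot_accuracies(history)
plot_losses(history)

# Final check on the held-out test set.
test_loader = DeviceDataLoader(DataLoader(test_ds, batch_size * 2), device)
evaluate(model, test_loader)
```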

On the chart, the accuracy increases rapidly up to 81%-85% and then levels off.

Now, let’s look at the history chart:

Predictions:

Let us now make some predictions using the test dataset we created and see how the model does on individual images:
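
A small helper for predicting the class of a single image:

```python
def predict_image(img, model):
    # Add a batch dimension, move to the device and pick the class
    # with the highest score.
    xb = to_device(img.unsqueeze(0), device)
    yb = model(xb)
    _, preds = torch.max(yb, dim=1)
    return dataset.classes[preds[0].item()]

img, label = test_ds[0]
print('Label:', dataset.classes[label],
      '- Predicted:', predict_image(img, model))
```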

Wow! The model is doing very well indeed! As you can see, it handles these images comfortably and has predicted this one just fine.

Excellent!

Very good!

With an accuracy of 81.3%, our model is quite effective: it correctly classifies most animals from the dataset!

As you can see, it is easy to define and train a model. The hardest part is choosing the right hyperparameter values, so take note that careful tuning matters for the best results. Anyway, I hope this article helps you learn!

The training step took a lot of time, probably due to the image resolution (512×512 pixels), which makes each batch large to fit in memory and slow to process.

We could use a CNN in the future to see if it brings better results.

Notebook used for this article:
https://jovian.ml/tonnyshm/nn-animal-faces-classification

References

1. https://towardsdatascience.com/finding-good-learning-rate-and-the-one-cycle-policy-7159fe1db5d6

2. https://jovian.ml/tonnyshm/03-cifar10-feedforward
