Animal Slaughtering Image Classifier

Pax Prz
Jul 1, 2020

It was just another day, and I thought: why not make a browser extension that hides graphic content? Yeah yeah, you might be thinking I need an image classifier first in order to do that. So this is the first part of that session. In this part, I’ll explain how I used transfer learning and images scraped from the internet to build an image classifier.

P.S.: I am not an expert in machine learning, but I am a good learner and I’ll share all the knowledge I’ve gained.

Let’s begin by importing the Python modules:
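The notebook embed isn’t shown here, so this import list is my assumption, reconstructed from what the rest of the post uses:

```python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models
import torchvision.transforms as tt
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader, Subset
from torchvision.datasets import ImageFolder
from torchvision.utils import make_grid
```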

Exploring the data

For the dataset collection, I scraped images from the internet using the Python requests and Selenium modules in Colab, and saved everything in a folder in my Google Drive. I won’t be covering that portion in this blog, but I’m sure you can easily find out how to do it on Stack Overflow.

Now, let’s load the image folder and transform the data into PyTorch tensors.
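A minimal sketch, assuming the images sit in a Google Drive folder with one subfolder per class (the path here is hypothetical):

```python
from torchvision.datasets import ImageFolder
import torchvision.transforms as tt

# Hypothetical path; the folder contains one subfolder per class
DATA_DIR = '/content/drive/My Drive/slaughter_dataset'

# First pass: just convert images to tensors, no resizing yet
raw_dataset = ImageFolder(DATA_DIR, transform=tt.ToTensor())
print(len(raw_dataset), raw_dataset.classes)

img, label = raw_dataset[0]
print(img.shape)  # channels x height x width; varies from image to image
```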

This is how my images look. The images are of arbitrary shapes.

Training and Validation Datasets

For image classification problems, the model performs better once we apply channel-wise normalization. So there is a series of basic steps. First we resize the image to 256×256 and perform a RandomCrop with reflect padding, which produces a different random crop of each image on every epoch of training. Besides that, I’ve added a random horizontal flip and a random rotation of up to 10 degrees. Finally, the 3-channel tensor is normalized channel-wise with a mean of 0.5 and a standard deviation of 0.5:
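In torchvision that pipeline looks roughly like this (the crop size and padding amount are my assumptions; the post only specifies reflect mode):

```python
import torchvision.transforms as tt
from torchvision.datasets import ImageFolder

stats = ((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # per-channel mean and std

train_tfms = tt.Compose([
    tt.Resize((256, 256)),
    # Padding amount is an assumption; reflect mode is from the post
    tt.RandomCrop(256, padding=8, padding_mode='reflect'),
    tt.RandomHorizontalFlip(),
    tt.RandomRotation(10),
    tt.ToTensor(),
    tt.Normalize(*stats)
])

dataset = ImageFolder(DATA_DIR, transform=train_tfms)
```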

So at this point, I got a little lazy. I actually needed to make one train set and another test set. Instead of copying things around in Google Drive, I took an easier route: I generated 400 random indices and made a data subset for test_dataset, with the rest going to train_dataset. Not a good practice, but hey, as Bill Gates reportedly said, “I choose a lazy person to do a hard job. Because a lazy person will find an easy way to do it.” :D
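A sketch of that lazy split using torch.utils.data.Subset (the seed is my choice; the author’s exact index generation isn’t shown):

```python
import numpy as np
from torch.utils.data import Subset

np.random.seed(42)  # seed chosen just for reproducibility
indices = np.random.permutation(len(dataset))

test_dataset = Subset(dataset, indices[:400])   # 400 held-out images
train_dataset = Subset(dataset, indices[400:])  # everything else
print(len(train_dataset), len(test_dataset))
```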

Now, let us visualize how our dataset looks:
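Something like this grid plot built with make_grid (the batch size is an assumption):

```python
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader
from torchvision.utils import make_grid

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True,
                          num_workers=2, pin_memory=True)

def show_batch(loader):
    """Plot one batch of images as a grid."""
    for images, _ in loader:
        fig, ax = plt.subplots(figsize=(12, 12))
        ax.set_xticks([]); ax.set_yticks([])
        # Undo Normalize(0.5, 0.5) so the colors look right
        grid = make_grid(images * 0.5 + 0.5, nrow=8)
        ax.imshow(grid.permute(1, 2, 0))
        break

show_batch(train_loader)
```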

Umm, so far, so good. Now it’s time to set up our model:

Setting up the model

I’ll be using the Residual Network (ResNet) provided by torchvision’s models in order to train on my data. We will be doing this on a GPU, so we’ll need a bit of additional setup, like migrating both our image tensors and our model to the CUDA device.

YES! FREE GPU! LONG LIVE GOOGLE COLAB! :D
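A minimal sketch of the usual device-handling pattern from the zerotogans course the author credits at the end:

```python
import torch
from torch.utils.data import DataLoader

def get_default_device():
    """Pick the GPU if one is available, otherwise fall back to the CPU."""
    return torch.device('cuda' if torch.cuda.is_available() else 'cpu')

def to_device(data, device):
    """Move a tensor (or a list/tuple of tensors) to the chosen device."""
    if isinstance(data, (list, tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

class DeviceDataLoader:
    """Wrap a DataLoader so each batch is moved to the device on the fly."""
    def __init__(self, dl, device):
        self.dl, self.device = dl, device
    def __iter__(self):
        for batch in self.dl:
            yield to_device(batch, self.device)
    def __len__(self):
        return len(self.dl)

device = get_default_device()
train_loader = DeviceDataLoader(train_loader, device)
test_loader = DeviceDataLoader(
    DataLoader(test_dataset, batch_size=64, num_workers=2, pin_memory=True),
    device)
```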

Actually, I’ve defined two accuracy functions: one used during training, and this one for the final accuracy.
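The exact metrics aren’t shown here, so this is a sketch assuming a single sigmoid output thresholded at 0.5: a per-batch accuracy for training logs, and a whole-dataset one for the final score.

```python
import torch

def accuracy(outputs, labels):
    """Per-batch accuracy: threshold the sigmoid scores at 0.5."""
    preds = (outputs > 0.5).long()
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))

@torch.no_grad()
def final_accuracy(model, loader):
    """Accuracy over an entire dataset, used for the final score."""
    model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        scores = model(images)  # shape (batch,), values in [0, 1]
        correct += ((scores > 0.5).long() == labels).sum().item()
        total += len(labels)
    return correct / total
```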

Here comes the important part. I’ll just explain the __init__ and freeze functions here; the rest are already explained in my previous blogs. Do check them out, guys!
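A sketch of such a model (the class name and the classifier-head sizes are my assumptions; resnet34 and the freeze behavior are from the post):

```python
import torch.nn as nn
import torchvision.models as models

class GraphicContentModel(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        # Pretrained CNN feature extractor
        self.network = models.resnet34(pretrained=True)
        # Replace the final layer with a small trainable classifier (ANN)
        n_features = self.network.fc.in_features
        self.network.fc = nn.Sequential(
            nn.Linear(n_features, 128),
            nn.ReLU(),
            nn.Linear(128, 1),
            nn.Sigmoid()  # score in [0, 1]
        )

    def forward(self, xb):
        return self.network(xb).squeeze(1)

    def freeze(self):
        # Disable gradients for the pretrained layers...
        for param in self.network.parameters():
            param.requires_grad = False
        # ...but keep the new classifier head trainable
        for param in self.network.fc.parameters():
            param.requires_grad = True

    def unfreeze(self):
        for param in self.network.parameters():
            param.requires_grad = True
```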

So the network uses transfer learning from the existing resnet34 model. The CNN part is well trained and has the ability to learn patterns from images. The next part is an ANN that gives the final classification for the image. The existing model incorporates batch normalization, which improves the efficiency of the model.

The freeze function disables gradients for the tensors in the resnet34 model. But the ANN network that we create keeps requires_grad=True, making it the trainable part of our model.

Training the Model

First we will define some helper functions to train our model, along with a fit function. Here we use learning rate scheduling, weight decay, and gradient clipping to maximize the performance of the model.
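A condensed sketch of such a fit function, wiring the three techniques together (the optimizer choice, loss function, and history format are my assumptions; the post doesn’t show them):

```python
import torch
import torch.nn.functional as F

def fit_one_cycle(epochs, max_lr, model, train_loader, val_loader,
                  weight_decay=0, grad_clip=None):
    history = []
    optimizer = torch.optim.Adam(model.parameters(), max_lr,
                                 weight_decay=weight_decay)
    # Learning rate scheduling: one-cycle policy stepped every batch
    sched = torch.optim.lr_scheduler.OneCycleLR(
        optimizer, max_lr, epochs=epochs,
        steps_per_epoch=len(train_loader))

    for epoch in range(epochs):
        model.train()
        for images, labels in train_loader:
            loss = F.binary_cross_entropy(model(images), labels.float())
            loss.backward()
            # Gradient clipping keeps individual updates from exploding
            if grad_clip:
                torch.nn.utils.clip_grad_value_(model.parameters(), grad_clip)
            optimizer.step()
            optimizer.zero_grad()
            sched.step()
        val_acc = final_accuracy(model, val_loader)
        # Record the last batch's loss as a rough training signal
        history.append({'train_loss': loss.item(), 'val_acc': val_acc})
        print(f"Epoch {epoch}: train_loss {loss.item():.4f}, "
              f"val_acc {val_acc:.4f}")
    return history
```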

Initially, the model has around 50% accuracy, i.e., it essentially guesses at random whether each image contains graphic content or not.

Now, let’s freeze the ResNet parameters and focus on training the ANN network we built. This is the essence of transfer learning: we do not need to train the feature extraction (CNN) part of the system, as it has already been trained on a large dataset and has learned to pick out important features in an image.

Now let’s finally start our training process. In my project, I trained for a longer run of around 30 epochs.
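Putting it together: move the model to the GPU, freeze the backbone, and train the head (all hyperparameter values here are assumptions):

```python
model = to_device(GraphicContentModel(), device)

# Transfer learning: train only the new classifier head at first
model.freeze()
history = fit_one_cycle(epochs=30, max_lr=1e-3, model=model,
                        train_loader=train_loader, val_loader=test_loader,
                        weight_decay=1e-4, grad_clip=0.1)
```

Once the head converges, one could also call model.unfreeze() and fine-tune the whole network for a few more epochs at a lower learning rate.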

Graphs and Plots

Graphs and plots help visualize our progress. Here, the accuracy, training loss, and validation loss that we recorded while training are plotted.
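With the history format from the fit sketch above (which only records training loss and validation accuracy), the plots are straightforward:

```python
import matplotlib.pyplot as plt

accuracies = [x['val_acc'] for x in history]
losses = [x['train_loss'] for x in history]

plt.plot(accuracies, '-x')
plt.xlabel('epoch'); plt.ylabel('accuracy')
plt.title('Validation accuracy vs. epochs')
plt.show()

plt.plot(losses, '-o')
plt.xlabel('epoch'); plt.ylabel('loss')
plt.title('Training loss vs. epochs')
plt.show()
```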

Testing with Individual Images

Finally, let’s see how images get classified by our system. I’ve used images from test_dataset individually to see the predicted score. The closer the score is to 1, the more likely the image contains graphic content, and vice versa.
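A sketch of a single-image prediction helper (the helper name and the test index are illustrative):

```python
import torch

def predict_image(img, model):
    """Return the model's graphic-content score for one image tensor."""
    model.eval()
    with torch.no_grad():
        xb = to_device(img.unsqueeze(0), device)  # add a batch dimension
        return model(xb).item()

img, label = test_dataset[10]  # arbitrary test index
print(f"Label: {label}, predicted score: {predict_image(img, model):.3f}")
```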

Final Score

Now, the final score we have all been waiting for. The system classifies an image as graphic content if its predicted score is above 0.5. Based on that, our test_dataset gets a score of 78.91%.
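With the final_accuracy helper sketched earlier, that number would come from a single pass over the test loader:

```python
score = final_accuracy(model, test_loader)
print(f"Test accuracy: {score * 100:.2f}%")
```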

Conclusion

The score was pretty satisfying for me. There were a lot of random images in the slaughter category; cleaning those up and improving the model could make the system more effective. But as I said earlier, I am not an expert, so in the future I’ll be using more effective techniques to make my model better and better. :D

Thank you so much, guys, for reading till the end. The Python notebook is at the jovian.ml link. I’ve uploaded the state_dict of my model here; you can download it, restore it, and try it on your own dataset. Finally, thanks to zerotogans.com.
