Pic Credit : IndiaMART

How To Get The Best Quality Red Wine (Using PyTorch and ML)

Nandan Pandey

Hey everyone, I know the title is quite interesting, and that’s why you are here. But believe me, the concept of finding the finest-quality red wine using ML and PyTorch is just as interesting.

Let’s move one step further. The dataset for wine quality has been taken from here.

Although we could solve this problem with any framework, our aim here is to learn PyTorch.

1. To start off, import all the important libraries.

2. Next, load the dataset, which is a CSV file, from disk into memory.

3. Check whether the dataset contains any null values.

Every column returned False, which means no column contains a null/NaN value.
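Here is a minimal sketch of steps 1 to 3. It is not the original notebook code: the CSV is replaced by a small in-memory string so the snippet runs on its own, and the rows and variable names (`csv_text`, `dataframe`, `null_report`) are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file on disk: a few illustrative rows
# using some of the wine-quality column names. With the real file you would
# pass its path to pd.read_csv instead (and check which delimiter it uses).
csv_text = """fixed acidity,volatile acidity,alcohol,quality
7.4,0.70,9.4,5
7.8,0.88,9.8,5
11.2,0.28,9.8,6
"""

dataframe = pd.read_csv(io.StringIO(csv_text))

# isnull() marks every missing cell; any() reduces that to one boolean per
# column, so a column shows True only if it has at least one missing value.
null_report = dataframe.isnull().any()
print(null_report)
```

If any column showed True, you would need to drop or fill those rows before moving on.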

4. Check the correlation between features with a correlation matrix. There are many ways to find correlations, but since there are not too many features here, I prefer to build a correlation matrix and visualise it.

I have treated a correlation as high only if it is above 0.7. I am too lazy to lower this threshold and perform the rather bulky pre-processing, especially since our aim here is to learn PyTorch, not to do feature engineering. So let’s move on.
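The correlation check can be sketched as follows. The data is synthetic (not the wine dataset), and the column names `a`, `b`, `c` are placeholders: `b` is built from `a` so that exactly one pair crosses the 0.7 threshold. Plotting is left out to keep the snippet dependency-free.

```python
import numpy as np
import pandas as pd

# Synthetic data, for illustration only: "b" is derived from "a" so the two
# are strongly correlated, while "c" is independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
frame = pd.DataFrame({
    "a": a,
    "b": 2.0 * a + rng.normal(scale=0.1, size=500),
    "c": rng.normal(size=500),
})

# DataFrame.corr() computes pairwise Pearson correlation by default.
corr = frame.corr()

# Collect each pair of distinct columns whose absolute correlation
# exceeds the threshold (upper triangle only, so no pair is listed twice).
threshold = 0.7
columns = list(corr.columns)
high_pairs = [
    (columns[i], columns[j])
    for i in range(len(columns))
    for j in range(i + 1, len(columns))
    if abs(corr.iloc[i, j]) > threshold
]
print(high_pairs)
```

On a real dataset, any pair that shows up in `high_pairs` is a candidate for dropping one of the two columns.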

5. Separate the input and target/output columns.

6. Convert the dataframes to NumPy arrays.

7. Convert the NumPy arrays to torch tensors.

8. Now wrap the tensors in a TensorDataset.
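Steps 5 to 8 could look like the sketch below. The tiny dataframe is made up, and treating `quality` as the target column is an assumption on my part; adapt the column names to your own data.

```python
import numpy as np
import pandas as pd
import torch
from torch.utils.data import TensorDataset

# Tiny made-up frame for illustration; "quality" as the target is an assumption.
frame = pd.DataFrame({
    "alcohol": [9.4, 9.8, 9.8, 10.0],
    "pH": [3.51, 3.20, 3.26, 3.16],
    "quality": [5, 5, 5, 6],
})

# Step 5: separate input columns from the target column.
input_cols = [c for c in frame.columns if c != "quality"]
output_cols = ["quality"]

# Step 6: DataFrame -> NumPy arrays (float32 matches PyTorch's default dtype).
inputs_array = frame[input_cols].to_numpy(dtype=np.float32)
targets_array = frame[output_cols].to_numpy(dtype=np.float32)

# Step 7: NumPy arrays -> torch tensors.
inputs = torch.from_numpy(inputs_array)
targets = torch.from_numpy(targets_array)

# Step 8: pair each input row with its target.
dataset = TensorDataset(inputs, targets)
print(len(dataset), inputs.shape, targets.shape)
```

Selecting the target with a list (`["quality"]`) keeps it two-dimensional, so the targets have shape `(rows, 1)` and line up with a model that outputs one value per row.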

9. Split the dataset into training, validation and test sets. Please note that here I am splitting it into training and validation sets only. Feel free to split it into three parts and experiment.

One per cent of the total data is held out as validation data.

10. Choose a batch size. Feel free to change it and experiment.

Use the DataLoader class, because it yields one batch of data per iteration.
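A minimal sketch of the split and the data loaders, assuming a dataset of 1000 random rows with 11 features (both numbers are placeholders, as is the batch size of 64):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)  # make the random split reproducible

# Random stand-in data: the sizes below are assumptions for illustration.
dataset = TensorDataset(torch.randn(1000, 11), torch.randn(1000, 1))

# Hold out 1% of the rows for validation, as described in the text.
val_fraction = 0.01
val_size = int(len(dataset) * val_fraction)
train_size = len(dataset) - val_size
train_ds, val_ds = random_split(dataset, [train_size, val_size])

batch_size = 64  # arbitrary choice; experiment with it
train_loader = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=batch_size)

# Each iteration over a DataLoader yields one (inputs, targets) batch.
xb, yb = next(iter(train_loader))
print(xb.shape, yb.shape)
```

Shuffling is only turned on for the training loader; the order of the validation data does not affect the loss.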

11. Now get your input and output sizes, so that the weights can be initialised with the right shape.

12. Next, create WineModel class as a skeleton of your model:

WineModel inherits from the nn.Module class. I have used l1_loss here. Feel free to experiment with other loss functions and see how they perform.

13. Create an object of the WineModel class:
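Steps 11 to 13 might look like the sketch below. The text only tells us that the class is called WineModel, that it inherits from nn.Module and that it uses l1_loss; the single linear layer, the helper names `training_step` and `validation_step`, and the sizes 11 and 1 are my assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WineModel(nn.Module):
    """Sketch of a regression model: one linear layer (an assumption)."""

    def __init__(self, input_size, output_size):
        super().__init__()
        # The weight matrix has shape (output_size, input_size).
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, xb):
        return self.linear(xb)

    def training_step(self, batch):  # hypothetical helper name
        inputs, targets = batch
        return F.l1_loss(self(inputs), targets)

    def validation_step(self, batch):  # hypothetical helper name
        inputs, targets = batch
        # detach() so the stored loss does not keep the graph alive.
        return {"val_loss": F.l1_loss(self(inputs), targets).detach()}


# Assumed sizes: 11 input features and 1 output value.
input_size, output_size = 11, 1
model = WineModel(input_size, output_size)

# Quick check on a random batch of 8 rows.
batch = (torch.randn(8, input_size), torch.randn(8, output_size))
loss = model.training_step(batch)
print(loss.item())
```

To try another loss, swap `F.l1_loss` for, say, `F.mse_loss` in both helper methods.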

14. Create fit and evaluate methods for training and validation.

15. It’s time to train your model with some learning rate for some number of epochs. Feel free to change these hyperparameters and train again.

16. I repeated the step above four times, keeping the number of epochs the same and using a different learning rate each time: lr = 1e-3, 1e-4, 1e-5 and 1e-6.

17. Our training is now done, so calculate the overall validation loss to get an idea of how well the model performs.
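A rough sketch of steps 14 to 17, under several assumptions: the data is synthetic, a plain `nn.Linear` stands in for WineModel, the optimizer is SGD, and the learning rates are picked to suit this toy data rather than copied from the values above. The function names `fit` and `evaluate` follow the text; their signatures are my guess.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic regression data (assumption): y = x @ true_w + 0.5
true_w = torch.tensor([[1.0], [-2.0], [0.5]])
x = torch.randn(256, 3)
y = x @ true_w + 0.5
train_loader = DataLoader(TensorDataset(x[:200], y[:200]), batch_size=32, shuffle=True)
val_loader = DataLoader(TensorDataset(x[200:], y[200:]), batch_size=32)

model = nn.Linear(3, 1)  # stand-in for WineModel


def evaluate(model, loader):
    """Return the mean L1 loss over every sample in the loader."""
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for inputs, targets in loader:
            total += F.l1_loss(model(inputs), targets, reduction="sum").item()
            count += targets.numel()
    return total / count


def fit(epochs, lr, model, train_loader, val_loader):
    """Train with SGD and record the validation loss after each epoch."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    history = []
    for _ in range(epochs):
        model.train()
        for inputs, targets in train_loader:
            loss = F.l1_loss(model(inputs), targets)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        history.append(evaluate(model, val_loader))
    return history


loss_before = evaluate(model, val_loader)

# Four training runs with a decreasing learning rate and the same epoch count.
history = []
for lr in (1e-1, 1e-2, 1e-3, 1e-4):
    history += fit(10, lr, model, train_loader, val_loader)

loss_after = history[-1]
print(loss_before, loss_after)
```

Starting with a larger learning rate and shrinking it lets the model move quickly at first and then settle, which is the idea behind repeating the training step with smaller values.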

18. Now it’s time to predict. Before moving forward, I want to remind you again that predictions should be made on the test set. With that in mind, create a method that makes a prediction for a single sample.

I am predicting on the validation set (you should try it on a test set).

Unsqueeze your input before passing it to the model. This step is important: it adds a leading dimension so that the shape of a single sample matches the batched inputs the model was trained on.
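A prediction helper could be sketched like this. The function name `predict_single`, the 11 input features and the untrained `nn.Linear` stand-in are all assumptions made so the snippet runs on its own.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Untrained stand-in for the trained model (assumption: 11 features, 1 output).
model = nn.Linear(11, 1)


def predict_single(sample, model):
    """Predict the target for one sample given as a 1-D tensor."""
    model.eval()
    with torch.no_grad():
        batch = sample.unsqueeze(0)   # shape (11,) -> (1, 11): a batch of one
        prediction = model(batch)     # shape (1, 1)
    return prediction.squeeze(0)      # shape (1,)


sample = torch.randn(11)
prediction = predict_single(sample, model)
print(prediction)
```

With a dataset, you would pull out one pair with `sample, target = val_ds[0]` and compare `prediction` against `target`.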

Wow… amazing! The prediction is very close. From today, you can choose the finest-quality red wine using this model and have fun!

But this is not always the case. Complex datasets involve a lot more steps, which we shall see in future posts. As a beginner, it’s important to understand PyTorch’s basic functionality for handling data, as well as the overall machine learning workflow.

So stay tuned for a deeper dive into some complex datasets. Till then, happy learning!

Special Thanks to Akash N S sir, founder of Jovian.ml

Kaggle Notebook Link

Contact Me

LinkedIn
