Train your own model with PyTorch

Download this notebook

This part is analogous to Train your own model with TensorFlow. If your problem has already been solved satisfactorily with a PhotonAI model, you can skip this chapter. Note, however, that the creation of the model is covered even more briefly here than in the previous chapter, so it may make sense to read that chapter first. This is also not an introduction to PyTorch or to building machine learning models in general. Rather, the aim is merely to create a simple model that will serve as an example in the deployment and publication process in the following parts. The PyTorch Tutorials will help you learn more about creating machine learning models with PyTorch.

Table of contents

PyTorch
Preparing the data
Creation and training of the model
Evaluation and storage
Using the model

PyTorch

PyTorch is an open-source deep learning framework that is particularly known for its flexible, dynamic computational graph. It facilitates the training of neural networks through an intuitive interface and offers extensive support for multidimensional data processing. PyTorch also allows developers to deploy models to various platforms, including mobile devices. The framework offers a wide range of functions and modules to meet individual requirements.

For Python, PyTorch can be installed directly in the terminal via pip:

pip install torch
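
To check that the installation worked, you can print the installed version and see whether a GPU is available (both calls are part of the standard torch API):

import torch
print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True if a CUDA-capable GPU can be used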

import pickle
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

import torch
import torch.nn as nn
import torch.optim as optim

Preparing the data

First, we define our data and project structure again. In this case, the notebook lives in the /incubaitor/1_Frameworks/1_3_Torch/notebooks/ folder. Relative to this, we have loaded the data (see previous chapter) into the folder /incubaitor/1_Frameworks/1_3_Torch/data/ and will save our models in /incubaitor/1_Frameworks/1_3_Torch/models/. This allows us to use the relative paths ../data/ and ../models/. If this causes problems, explicit paths can be used instead, as in the PhotonAI example.

After importing all libraries and functions, we first prepare our data analogously to the TensorFlow tutorial. It is important here that we wrap our data in torch.tensor(), the data structure with which PyTorch performs all its calculations efficiently.

def str_to_category(string):
    """
    Converts a string to a category.
    :param string: Input string.
    :return: Category.
    """
    return string.strip(' \t\n').lower().replace(' ', '_').replace('-', '')

# Load data and split into labels and features
data = pd.read_csv('../data/vw.csv', converters={
    'model': str_to_category,
    'transmission': str_to_category,
    'fuelType': str_to_category,
})
label = data.pop('price')

# Encode categorical data
model_enc = OrdinalEncoder()
data['model'] = model_enc.fit_transform(data[['model']])
transmission_enc = OrdinalEncoder()
data['transmission'] = transmission_enc.fit_transform(data[['transmission']])
fuelType_enc = OrdinalEncoder()
data['fuelType'] = fuelType_enc.fit_transform(data[['fuelType']])

# Split data into training and test sets
train_data, test_data, train_label, test_label = train_test_split(
    data, 
    label, 
    test_size=0.2, 
    random_state=42
    )

# Normalize data
data_scaler = StandardScaler()
train_data = torch.tensor(
    data_scaler.fit_transform(train_data), 
    dtype=torch.float32
    ) 
test_data = torch.tensor(
    data_scaler.transform(test_data), 
    dtype=torch.float32
    ) 

# Normalize labels
label_scaler = StandardScaler()
train_label = torch.tensor(
    label_scaler.fit_transform(
        train_label.values.reshape(-1, 1)
        ), 
    dtype=torch.float32) 
test_label = torch.tensor(
    label_scaler.transform(
        test_label.values.reshape(-1, 1)
        ), 
        dtype=torch.float32) 
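
Before training, a quick sanity check that the tensors have the expected shapes and data type can save debugging time later:

# Check tensor shapes and dtype
print(train_data.shape, train_label.shape)
print(test_data.shape, test_label.shape)
print(train_data.dtype)  # should be torch.float32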

Creation and training of the model

Similar to TensorFlow, we use the nn.Sequential container to create a simple multilayer perceptron (MLP) with two hidden layers of 64 units each and ReLU activations:

# Create the model
model = nn.Sequential(
    nn.Linear(train_data.shape[1], 64), 
    nn.ReLU(),  
    nn.Linear(64, 64),  
    nn.ReLU(),  
    nn.Linear(64, 1)  
)
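
Printing the model shows the layer structure, and a short expression over model.parameters() counts the trainable weights:

# Inspect the architecture and count trainable parameters
print(model)
num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f'Trainable parameters: {num_params}')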

Unlike TensorFlow with Keras, PyTorch does not provide compile() and fit() methods to train the MLP. Instead, we implement the training loop ourselves in a train_model() function. We also specify a loss function and an optimizer, as well as the number of epochs and the batch size. During training, progress is printed regularly to the console, which lets us estimate how long training will take and verify that the loss is actually decreasing and the model is converging.

# Define the loss function as the Mean Squared Error
criterion = nn.MSELoss()

# Define the optimizer (Adam)
optimizer = optim.Adam(model.parameters())

# Define function for training the model
def train_model(model, 
                criterion, 
                optimizer, 
                train_data, 
                train_label, 
                epochs, 
                batch_size
                ):
    
    model.train()
    for epoch in range(epochs):
        for i in range(0, len(train_data), batch_size):
            batch_data = train_data[i:i+batch_size]
            batch_label = train_label[i:i+batch_size]

            optimizer.zero_grad()
            output = model(batch_data)
            loss = criterion(output, batch_label)
            loss.backward()
            optimizer.step()
        # note: this reports the loss of the last batch in the epoch
        print(f'Epoch: {epoch+1}/{epochs}, loss: {loss.item()}')

# Train the model
train_model(model, 
            criterion, 
            optimizer, 
            train_data, 
            train_label, 
            epochs=50, 
            batch_size=64
            )
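
Note that the loop above always visits the batches in the same order. In practice, you would usually shuffle the training data each epoch, for example with PyTorch's DataLoader. A minimal sketch of how the batching inside train_model() could be replaced:

from torch.utils.data import TensorDataset, DataLoader

# Wrap the tensors in a dataset; the DataLoader reshuffles every epoch
dataset = TensorDataset(train_data, train_label)
loader = DataLoader(dataset, batch_size=64, shuffle=True)
for batch_data, batch_label in loader:
    optimizer.zero_grad()
    loss = criterion(model(batch_data), batch_label)
    loss.backward()
    optimizer.step()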

Evaluation and storage

To evaluate the model, we again write a function, evaluate_model(), which reports the test loss in two metrics: the mean squared error (our loss function) and the mean absolute error. We also want to visualize our results again and do this analogously to the TensorFlow tutorial.

# Define function for evaluating the model
def evaluate_model(model, criterion, test_data, test_label):
    model.eval()
    with torch.no_grad():
        output = model(test_data)
        test_loss = criterion(output, test_label)
        mae = torch.mean(torch.abs(output - test_label))
    return test_loss, mae

# Evaluate the model
test_loss, mae = evaluate_model(model, criterion, test_data, test_label)
print(f'Test Loss: {test_loss}, Mean Absolute Error: {mae}')
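
Because the labels were standardized, both metrics are in standardized units. To express the mean absolute error in the original price scale, we can multiply by the label scaler's standard deviation (stored in sklearn's scale_ attribute):

# Convert the MAE back to the original price scale
mae_price = mae.item() * label_scaler.scale_[0]
print(f'MAE in original units: {mae_price:.2f}')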

# Plot some predictions
predictions = label_scaler.inverse_transform(
    model(test_data).detach().numpy()
    ) 
test_label = label_scaler.inverse_transform(test_label.numpy()) 
plt.scatter(test_label, predictions, s=0.1)
plt.plot([0, test_label.max()], [0, test_label.max()], '--', color='red')
plt.xlabel('True Values')
plt.ylabel('Predictions')
plt.show()

To save the model, we can simply use torch.save(), which saves the model to the desired path. We also save all our encoders and scalers again so that we can later prepare our input data correctly.

# Save the model, encoder and scaler
Path("../models/").mkdir(parents=True, exist_ok=True)
torch.save(model, '../models/model.pt') 
pickle.dump(model_enc, open('../models/model_enc.pkl', 'wb'))
pickle.dump(transmission_enc, open('../models/transmission_enc.pkl', 'wb'))
pickle.dump(fuelType_enc, open('../models/fuelType_enc.pkl', 'wb'))
pickle.dump(data_scaler, open('../models/data_scaler.pkl', 'wb'))
pickle.dump(label_scaler, open('../models/label_scaler.pkl', 'wb'))
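
As a side note, torch.save(model, ...) pickles the entire model object, including its class. The approach recommended in the PyTorch documentation is to save only the state_dict and rebuild the architecture when loading; a sketch (the filename model_state.pt is just an example):

# Alternative: save only the weights
torch.save(model.state_dict(), '../models/model_state.pt')

# ... later, restore them into a freshly built model of the same architecture
model.load_state_dict(torch.load('../models/model_state.pt'))

In this tutorial, we stick with the full-model variant for simplicity.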

Using the model

To use the model for prediction, we load it with torch.load() and create a DataFrame in which we store sample data. Again, make sure that the data is wrapped in torch.tensor(). Note that newer PyTorch versions (2.6 and later) default torch.load() to weights_only=True, so loading a fully pickled model like ours requires passing weights_only=False.

# Load model, encoder, and scaler
model = torch.load("../models/model.pt", weights_only=False)  # weights_only=False needed on PyTorch 2.6+
model_enc = pickle.load(open('../models/model_enc.pkl', 'rb'))
transmission_enc = pickle.load(open('../models/transmission_enc.pkl', 'rb'))
fuelType_enc = pickle.load(open('../models/fuelType_enc.pkl', 'rb'))
data_scaler = pickle.load(open('../models/data_scaler.pkl', 'rb'))
label_scaler = pickle.load(open('../models/label_scaler.pkl', 'rb'))


# Load and prepare test data
dummy_data = pd.DataFrame({
        "model": [str_to_category("T-Roc")],
        "year": [2019],
        "transmission": [str_to_category("Manual")],
        "mileage": [12132],
        "fuelType": [str_to_category("Petrol")],
        "tax": [145],
        "mpg": [42.7],
        "engineSize": [2.0],
    })
dummy_data.loc[:, "model"] = model_enc.transform(
    dummy_data.loc[:, ["model"]]
    )
dummy_data.loc[:, "transmission"] = transmission_enc.transform(
    dummy_data.loc[:, ["transmission"]]
    )
dummy_data.loc[:, "fuelType"] = fuelType_enc.transform(
    dummy_data.loc[:, ["fuelType"]]
    )
dummy_data = data_scaler.transform(dummy_data)

# Convert the dummy_data to PyTorch Tensor
dummy_data = torch.tensor(dummy_data, dtype=torch.float32)

Finally, we can predict the price of the example car:

# Predict
model.eval()  # good practice before inference (no effect for this simple MLP)
with torch.no_grad():
    result = model(dummy_data)
    result = label_scaler.inverse_transform(result.numpy())[0, 0]
    print(result)

The result is about 26788.51, which is close to the results of the PhotonAI and TensorFlow models and also corresponds to a realistic price.

We were therefore able to design, train, and now even use our own model with PyTorch. Although this required significantly more manual steps, which the PhotonAI Hyperpipe previously handled for us, we were also able to use categorical features for our prediction.