Train your own model with Keras

What is Keras?

Keras was published in March 2015 by François Chollet to simplify the development of deep learning models. The standalone Python library later became the official high-level API of TensorFlow, which integrated it into the TensorFlow ecosystem. Since then, Keras has been used in a wide range of deep learning applications. Its main characteristic is a user-friendly API for rapid prototyping. In addition to TensorFlow, earlier Keras versions also supported other backend engines such as Theano and the Microsoft Cognitive Toolkit (CNTK).

Keras 3.0

On 28 November 2023, Keras 3.0 was released with support for the backend engines JAX, TensorFlow and PyTorch. The defining feature of this multi-framework deep learning API is its backend-independent design: the same model code runs on any of the three backends without changes. Because the backend can be switched freely, a model can be trained on whichever backend is most efficient for it.

Keras 3.0 models can be integrated into PyTorch, TensorFlow and JAX workflows. In addition, models can be exported without being tied to one framework and used by other users. Keras 3.0 accepts various data sources (tf.data.Dataset, PyTorch DataLoader, NumPy arrays and Pandas DataFrames), which makes training possible across frameworks and data types.

Keras 3 Car Price Prediction

In the first section of the following example, we return to the example from the previous chapters Train your own model with TensorFlow and Introduction to Torch. It is therefore advisable to work through these chapters first, both for your own understanding and to make setting up Keras 3.0 easier.

Keras with JAX Backend

In the previous notebooks we have got to know the TensorFlow module. However, TensorFlow is not a pure Python package: its core is written in C++. This means that it has to be compiled for each Python version. To avoid compatibility problems with Python 3.12, we use Python version 3.11.7. It is therefore advisable to adapt the programming environment now. This does not mean that the existing Python version should be deleted. If possible, set up a virtual environment, for example in Visual Studio Code. We use JAX as the backend for Keras 3.0 in the following sections.

Backend configuration

[Setting up the backend](https://keras.io/getting_started/) must be done before installing Keras 3.0. To do this, we open the terminal and install the package JAX (pip install jax), TensorFlow or PyTorch using the familiar pip command. We then install the latest Keras version with pip install keras. The reason for this order is that TensorFlow versions up to 2.15 install Keras 2 and would overwrite an existing Keras 3 installation; from TensorFlow 2.16 onwards, Keras 3 is installed by default.

It is also recommended to update the following modules and possibly also Keras:

KerasCV: Computer Vision Workflows with pip install --upgrade keras-cv

KerasNLP: Natural Language Workflows with pip install --upgrade keras-nlp

Keras with pip install --upgrade keras

The installation and setup of the new backend should now be complete. If this is not the case, Keras 3 should be reinstalled.

pip uninstall keras
pip install keras
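Once the packages are installed, the backend is chosen through the KERAS_BACKEND environment variable, which Keras reads when it is first imported. The following is a minimal sketch of a guard around that step. The helper name select_keras_backend and the restriction to the three backends named in this chapter are assumptions made for illustration and are not part of Keras.

```python
import os
import sys

# Backends named in this chapter (assumption: we restrict ourselves to these three)
SUPPORTED_BACKENDS = ("jax", "tensorflow", "torch")


def select_keras_backend(name):
    """Hypothetical helper: set KERAS_BACKEND before keras is imported."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(f"Unknown backend: {name!r}")
    if "keras" in sys.modules:
        # Keras reads the variable at import time, so a later change has no effect
        raise RuntimeError("Select the backend before importing keras")
    os.environ["KERAS_BACKEND"] = name
    return name


select_keras_backend("jax")
print(os.environ["KERAS_BACKEND"])
```

The check against sys.modules is there because the most common mistake is to set the variable after keras has already been imported, in which case the previously active backend stays in use.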

import os
import jax
# The backend must be selected before keras is imported
os.environ["KERAS_BACKEND"] = "jax"
import keras
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from keras.models import Sequential
from keras.layers import Dense, Input
from keras import ops  # backend-independent tensor operations

Loading data

As in the previous examples, we use the read_csv method from the Python package Pandas to read in our data. Compared with the other chapters, however, the data preparation here is significantly shorter.

With data.shape we can output the dimensions or the shape of the data. In our example, the data set contains 15157 rows (cars) and nine columns (characteristics).

data = pd.read_csv('../data/vw.csv')

data.shape

We can also simply output the table. The first line contains the headings of the nine columns.

data.head()
print(data)
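To see what data.shape returns without loading the full file, here is a small sketch on a hand-made table. The three rows are invented for illustration and are not taken from vw.csv; only the column names follow the data set.

```python
import pandas as pd

# Three invented rows with the nine column names of the data set
sample_cars = pd.DataFrame({
    "model": ["Golf", "Polo", "T-Roc"],
    "year": [2018, 2017, 2019],
    "price": [15000, 9500, 21000],
    "transmission": ["Manual", "Manual", "Automatic"],
    "mileage": [20000, 35000, 10000],
    "fuelType": ["Petrol", "Petrol", "Diesel"],
    "tax": [145, 145, 145],
    "mpg": [50.4, 58.9, 49.6],
    "engineSize": [1.4, 1.0, 1.5],
})

n_rows, n_columns = sample_cars.shape  # (number of rows, number of columns)
print(n_rows, n_columns)

# Non-numerical columns are the categorical features that need encoding later
categorical_columns = sample_cars.select_dtypes(exclude="number").columns.tolist()
print(categorical_columns)
```

Listing the non-numerical columns at this stage shows in advance which features will have to be encoded before training.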

Define features and target variable

In the following section, the features are extracted from the data and stored in the variable “X”. The price, as the target variable, is stored in the variable “y”. In contrast to the other examples, no further preparation is needed at this point: neither a conversion (as in the TensorFlow example) nor an adaptation to a tensor (as in the PyTorch example) is required.

X = data[['model',
          'year',
          'transmission',
          'mileage',
          'fuelType',
          'tax',
          'mpg',
          'engineSize']]
y = data['price']

Coding of categorical variables

In addition to numerical values (e.g. an engineSize of 1.2), the data table also contains categorical characteristics (model, transmission type, fuel type), which describe qualitative properties. These must be converted into numerical values (e.g. 1, 2, 3 or separate 0/1 columns) so that the model can process them.

With pd.get_dummies(X) the categorical characteristics can be converted into dummy variables. This is also known as one-hot encoding or 1-of-n code. For each categorical variable, get_dummies() creates new columns in the DataFrame, with each column corresponding to one unique category. If a row belongs to that category, the value in the corresponding dummy column is 1, otherwise it is 0. This method is useful when there is no natural order between the categories, such as different car types.

X = pd.get_dummies(X)
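The effect of get_dummies() is easiest to see on a very small table. The following sketch uses invented values and hypothetical variable names (demo_cars, demo_encoded) so that it does not interfere with X.

```python
import pandas as pd

# Invented mini-table: one numerical and one categorical column
demo_cars = pd.DataFrame({
    "year": [2018, 2019, 2017],
    "transmission": ["Manual", "Automatic", "Manual"],
})

demo_encoded = pd.get_dummies(demo_cars)

# Numerical columns are kept; each category becomes its own column
print(demo_encoded.columns.tolist())
# → ['year', 'transmission_Automatic', 'transmission_Manual']

# Shown as 0/1 for readability (recent pandas versions store True/False)
print(demo_encoded.astype(int).to_dict("list"))
```

The column names are built from the original column name and the category, which is why the number of columns grows with the number of distinct categories.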

LabelEncoder differs from this: it assigns unique integer values to categorical variables with a natural order, e.g. different transmission types (manual, automatic, etc.).
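For comparison, a short sketch of what scikit-learn's LabelEncoder produces. The values are invented. Note that the integers follow the alphabetical order of the labels, so they only carry a meaningful ranking if that happens to match the natural order of the categories.

```python
from sklearn.preprocessing import LabelEncoder

# Invented example values for one categorical column
demo_transmissions = ["Manual", "Automatic", "Semi-Auto", "Manual"]

demo_encoder = LabelEncoder()
demo_codes = demo_encoder.fit_transform(demo_transmissions)

# Classes are sorted alphabetically and numbered from 0
print(list(demo_encoder.classes_))
print(demo_codes.tolist())  # → [1, 0, 2, 1]

# The fitted encoder can map the integers back to the original labels
print(demo_encoder.inverse_transform(demo_codes).tolist())
```

In contrast to one-hot encoding, the result is a single column, which keeps the table narrow but lets the model interpret the codes as ordered numbers.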

label_encoders = {}
for column in ['model','transmission', 'fuelType']:
    label_encoders[column] = LabelEncoder()
    
# Split dataset into train and test data
X_train, X_test, y_train, y_test = train_test_split(
    X, 
    y, 
    test_size=0.2, 
    random_state=42
    )
# Standardize the features (fit on the training data only)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
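The scaler is fitted on the training data only and then reused for the test data, so that no information from the test set leaks into training. The following sketch reproduces the calculation by hand on invented mileage values; demo_scaler and the other names are hypothetical and chosen so that they do not overwrite the variables above.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Invented mileage values, split by hand into train and test
demo_train = np.array([[10000.0], [20000.0], [30000.0]])
demo_test = np.array([[40000.0]])

demo_scaler = StandardScaler()
demo_train_scaled = demo_scaler.fit_transform(demo_train)  # learns mean and std
demo_test_scaled = demo_scaler.transform(demo_test)        # reuses those statistics

# The same result by hand: z = (x - mean) / std
train_mean = demo_train.mean(axis=0)
train_std = demo_train.std(axis=0)  # population standard deviation (ddof=0)
manual_test_scaled = (demo_test - train_mean) / train_std

print(demo_train_scaled.ravel())
print(demo_test_scaled.ravel(), manual_test_scaled.ravel())
```

The test value ends up well above 1 because it lies outside the range seen during fitting; this is expected and shows that the test data is measured against the training statistics.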

Creation and training of the model

In the following section, the Keras model for predicting the car price is defined. First, input_shape = X_train_scaled.shape[1] stores the number of columns (features) in the scaled training data set. The sequential Keras model is then defined with model = Sequential([]). This is a linear stack of layers.

The first layer of the model is the input layer, which receives the previously defined input_shape. The two following dense layers each have 64 neurons and use the activation function activation='relu' (ReLU = Rectified Linear Unit). ReLU is a non-linear activation function that is frequently used in neural networks. The last layer, Dense(1), defines the output layer with one neuron. This layer does not need an activation function because the model predicts a continuous value. A summary of the model is then output with model.summary().
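What a dense layer with ReLU computes can be sketched in a few lines of NumPy. This is an illustration with invented weights and a much smaller network than the one used here, not the Keras implementation; the function names relu and dense are chosen for this example.

```python
import numpy as np


def relu(values):
    """Rectified Linear Unit: negative values become 0, positive values pass through."""
    return np.maximum(values, 0.0)


def dense(inputs, weights, bias, activation=None):
    """Compute inputs @ weights + bias, followed by an optional activation."""
    outputs = inputs @ weights + bias
    return activation(outputs) if activation is not None else outputs


# Invented numbers: one row with 2 features, a hidden layer with 3 neurons
feature_row = np.array([[1.0, -2.0]])
hidden_weights = np.array([[0.5, -1.0, 2.0],
                           [1.0, 0.5, 0.5]])
hidden_bias = np.array([0.0, 1.0, -0.5])
hidden_output = dense(feature_row, hidden_weights, hidden_bias, activation=relu)
print(hidden_output)

# Output layer with one neuron and no activation, as used for regression
output_weights = np.array([[1.0], [1.0], [2.0]])
output_bias = np.array([3.0])
demo_price = dense(hidden_output, output_weights, output_bias)
print(demo_price)
```

Two of the three hidden neurons are switched off by ReLU in this example, while the output neuron passes its weighted sum on unchanged, which is what allows it to produce any continuous value.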

# Define the Keras model

input_shape = X_train_scaled.shape[1]

model = Sequential([
    Input(shape=(input_shape,)),  # One input per feature column
    Dense(64, activation='relu'),  # First hidden layer
    Dense(64, activation='relu'),
    Dense(1)  # Output layer (single neuron for regression)
])

model.summary()
Model: "sequential_1"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense_3 (Dense)                 │ (None, 64)             │         2,560 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_4 (Dense)                 │ (None, 64)             │         4,160 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_5 (Dense)                 │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 6,785 (26.50 KB)
 Trainable params: 6,785 (26.50 KB)
 Non-trainable params: 0 (0.00 B)
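The parameter counts in the summary can be checked by hand: a dense layer has one weight per input and neuron plus one bias per neuron. In the sketch below, the value of 39 input columns is inferred from the 2,560 parameters of the first layer and is not stated elsewhere in this chapter; the assumption of 4 bytes per parameter corresponds to 32-bit floats.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return (n_inputs + 1) * n_units


# 2,560 parameters in the first layer imply 39 input columns: (39 + 1) * 64
n_features = 39
layer_sizes = [64, 64, 1]

param_counts = []
previous_size = n_features
for units in layer_sizes:
    param_counts.append(dense_params(previous_size, units))
    previous_size = units

print(param_counts)       # → [2560, 4160, 65]
print(sum(param_counts))  # → 6785

# Memory footprint, assuming 4 bytes per parameter (float32)
print(round(sum(param_counts) * 4 / 1024, 2))  # → 26.5
```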
# Compile model
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])
# Train model
model.fit(X_train_scaled, y_train, epochs=10, batch_size=9, validation_split=0.2)
# Evaluate model
loss, mae = model.evaluate(X_test_scaled, y_test)
print(f'Test Mean Absolute Error: {mae}')
# Make predictions
predictions = model.predict(X_test_scaled)
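The model is trained on the mean squared error and additionally reports the mean absolute error (mae). As a sketch of what these two quantities measure, here they are written out in plain Python on invented prices; the function names are chosen for this example.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences; large errors weigh heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences, in the same unit as the price."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented prices and predictions
actual_prices = [15000, 9500, 21000]
predicted_prices = [14000, 10000, 21500]

print(mean_squared_error(actual_prices, predicted_prices))   # → 500000.0
print(mean_absolute_error(actual_prices, predicted_prices))
```

The mean absolute error is the easier of the two to interpret, because it states directly by how much the predicted price is off on average.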

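To save the trained and evaluated model, we use model.save(name). In Keras 3, a single call writes the model (architecture, weights and compile configuration) to one .keras file; in the Torch example, by contrast, we had to assemble the save step ourselves and add all our encoders and scalers to it. Note that the .keras file contains only the model: the fitted StandardScaler is not part of it. So that we can tell the German and English versions apart, we simply add a 2 to the file name: model.save("../models/car_price_prediction_model_2.keras").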

model.save("../models/car_price_prediction_model_2.keras")
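The fitted scaler and the column order produced by get_dummies() live outside the Keras model. If the model is to be used in a fresh session, they have to be stored separately. The following is a minimal sketch, assuming Python's pickle module is acceptable for this; the file name and the stand-in objects are hypothetical.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-ins for the fitted scaler and the one-hot column names
demo_scaler = StandardScaler().fit(np.array([[10000.0], [20000.0], [30000.0]]))
demo_columns = ["mileage"]

# Hypothetical file name; written to a temporary directory in this sketch
preprocessing_path = os.path.join(tempfile.mkdtemp(), "car_price_preprocessing.pkl")
with open(preprocessing_path, "wb") as file:
    pickle.dump({"scaler": demo_scaler, "columns": demo_columns}, file)

# In a later session, load the objects again and apply them to new inputs
with open(preprocessing_path, "rb") as file:
    restored = pickle.load(file)

print(restored["columns"])
print(restored["scaler"].transform(np.array([[20000.0]])))
```

Pickle files should only be loaded from trusted sources, since loading can execute arbitrary code.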

Use of the model

The model is loaded in Keras 3 simply by model = keras.models.load_model(name). The values we have already used in the other notebooks are also defined.

model = keras.models.load_model("../models/car_price_prediction_model_2.keras")
# Define the input values
input_data = {
    'model': ['T-Roc'],
    'year': [2019],
    'transmission': ['manual'],
    'mileage': [10000],
    'fuelType': ['Petrol'],
    'tax': [145],
    'mpg': [49.6],
    'engineSize': [1.5]
}
input_df = pd.DataFrame(input_data)
print(input_df)
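# One-hot encode the input and align it with the training columns;
# category spellings must match those in the training data
input_encoded = pd.get_dummies(input_df).reindex(columns=X.columns, fill_value=0)
# Apply the scaler that was fitted on the training data
input_scaled = scaler.transform(input_encoded)

Price forecast

# Make prediction for the input values defined above
predicted_price = model.predict(input_scaled)[0][0]
print("Predicted price:", predicted_price)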

The predicted price is slightly lower than the predictions of the other models. However, a check on comparison portals shows that the price is realistic for a model with this engine.