Train your own model with Keras
What is Keras?
Keras was released in March 2015 by François Chollet to simplify the development of deep learning models. The standalone Python library later became the official high-level API of TensorFlow and was integrated into its ecosystem. Since then, Keras has been used in a wide range of deep learning applications. Its main strength is a user-friendly API for rapid prototyping. Besides TensorFlow, other backend engines such as Theano and the Microsoft Cognitive Toolkit (CNTK) were also supported.
Keras 3.0
On 28.11.2023, Keras 3.0 was released with support for the backend engines JAX, TensorFlow and PyTorch. The defining feature of this multi-framework deep learning API is that it is backend-independent: the same model code runs on any of the three backends without changes. Because the backend can be switched freely, you can choose whichever one trains a given model most efficiently.
Keras 3.0 models can be used together with native PyTorch, TensorFlow and JAX workflows. In addition, models can be exported independently of the framework they were trained with and passed on to other users. Keras 3.0 accepts various data sources (tf.data.Dataset, PyTorch DataLoader, NumPy arrays and Pandas DataFrames), so training works across frameworks and data types.
Keras 3 Car Price Prediction
In the first section of the following example, we return to the example from the previous chapters, Train your own model with TensorFlow and Introduction to Torch. It is therefore advisable to work through these chapters first, both for your own understanding and to make setting up Keras 3.0 easier.
Keras with JAX Backend
In the previous notebooks we got to know the TensorFlow module. However, TensorFlow is not a pure Python package: its core is written in C++ and has to be compiled for each Python version. To avoid compatibility problems with Python 3.12, we use Python 3.11.7. It is therefore advisable to adapt your programming environment now. This does not mean that the existing Python version has to be deleted; if possible, set up a virtual environment instead, for example in Visual Studio Code. We use JAX as the backend for Keras 3.0 in the following sections.
Backend configuration
[Setting up the backend](https://keras.io/getting_started/) must be done before installing Keras 3.0. To do this, we open the terminal and install JAX (pip install jax), TensorFlow or PyTorch using the familiar pip command. We then install the latest Keras version with pip install keras. The order matters because TensorFlow up to version 2.15 is bound to Keras 2 and would overwrite a previously installed Keras 3; as of TensorFlow 2.16, this binding has been removed.
It is also recommended to update the following modules and possibly also Keras:
KerasCV: computer vision workflows with pip install --upgrade keras-cv
KerasNLP: natural language workflows with pip install --upgrade keras-nlp
Keras with pip install --upgrade keras
The installation and setup of the new backend should now be complete. If this is not the case, Keras 3 should be reinstalled.
pip uninstall keras
pip install keras
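Once the packages are installed, the backend can be selected from Python. The following is a minimal sketch, assuming JAX and Keras 3 were installed as described above; it relies on the KERAS_BACKEND environment variable, which Keras reads when it is imported for the first time.

```python
import os

# Must be set before the first `import keras`; Keras reads it once at import time.
os.environ["KERAS_BACKEND"] = "jax"

try:
    import keras

    # Report which version and backend are actually active.
    print("Keras", keras.__version__, "with backend:", keras.backend.backend())
except ImportError:
    # Raised if Keras or the selected backend package is missing.
    print("Keras or JAX is not installed; run `pip install jax keras` first.")
```

If the printed backend is not jax, Keras was most likely imported earlier in the same session; restarting the kernel and running this cell first should resolve that.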
Loading data
As in the previous examples, we use the read_csv function from the Python package Pandas to read in our data. Compared with the other chapters, however, the data preparation here is significantly shorter.
With data.shape we can output the dimensions (the shape) of the data. In our example, the data set contains 15157 rows (cars) and nine columns (characteristics).
We can also simply output the table. The first line contains the headings of the nine columns.
data.head()
print(data)
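The loading steps can be tried out without the original file. The sketch below reads a small invented sample from a string instead of from disk; the three rows and the column names are assumptions made for illustration only and are not taken from the real data set.

```python
import io

import pandas as pd

# Invented three-row sample standing in for the real CSV file. The column
# names are assumptions, chosen only to give nine columns as in the text.
csv_text = """model,year,price,transmission,mileage,fuelType,tax,mpg,engineSize
ModelA,2017,12000,Manual,15944,Petrol,150,57.7,1.0
ModelB,2018,14000,Manual,9083,Petrol,150,57.7,1.2
ModelC,2019,17500,Automatic,12456,Diesel,145,48.7,2.0
"""

# With a file on disk, the file path would be passed to pd.read_csv instead.
data = pd.read_csv(io.StringIO(csv_text))

print(data.shape)   # tuple of (number of rows, number of columns)
print(data.head())  # first rows, with the column headings on top
```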
Define features and target variable
In the following section, the features are extracted from the data and stored in the variable “X”. The price, as the target variable, is stored in the variable “Y”. This preparation step differs from the other examples in that neither a conversion (as in the TensorFlow example) nor an adaptation to a tensor (as in the PyTorch example) is required.
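A minimal sketch of this split is shown here, assuming the target column is called price. The small DataFrame and its column names are invented for illustration.

```python
import pandas as pd

# Invented sample; only the idea of a "price" target column follows the text.
data = pd.DataFrame(
    {
        "model": ["ModelA", "ModelB", "ModelC"],
        "engineSize": [1.0, 1.2, 2.0],
        "price": [12000, 14000, 17500],
    }
)

X = data.drop(columns=["price"])  # features: every column except the target
Y = data["price"]                 # target variable: the price

print(X.columns.tolist())
print(Y.tolist())
```

Both X and Y remain ordinary Pandas objects at this point; no tensor conversion is involved.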
Coding of categorical variables
In addition to numerical values (e.g. an engineSize of 1.2), the data table also contains categorical characteristics (model, transmission type, fuel type), which describe qualitative properties. These must be converted into numerical form (e.g. discrete classes 1, 2, 3) so that the model can process them.
With pd.get_dummies(X) the categorical characteristics can be converted into dummy variables. This is also known as one-hot encoding or 1-of-n coding. For each categorical variable, get_dummies() creates new columns in the DataFrame, with each column corresponding to a unique category. If a row belongs to a category, the value in the corresponding dummy column is 1, otherwise it is 0. This method is useful when there is no natural order between the categories, such as different car types.
This differs from label_encoder(), which assigns unique integer values to categorical variables with a natural order, e.g. different transmission types (manual, automatic, etc.).
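To make the effect of one-hot encoding concrete, here is a small sketch with invented values. The column names are assumptions; dtype=int is passed so that the dummy columns contain 0 and 1 rather than booleans.

```python
import pandas as pd

# Invented sample with one numerical and one categorical column.
X = pd.DataFrame(
    {
        "engineSize": [1.2, 2.0, 1.5],
        "transmission": ["Manual", "Automatic", "Manual"],
    }
)

# Numerical columns are kept as they are; each category of a categorical
# column becomes its own 0/1 column named <column>_<category>.
X_encoded = pd.get_dummies(X, dtype=int)

print(X_encoded.columns.tolist())
# ['engineSize', 'transmission_Automatic', 'transmission_Manual']
print(X_encoded["transmission_Manual"].tolist())
# [1, 0, 1]
```

Each row has exactly one 1 among the dummy columns of a given variable, which is where the name one-hot comes from.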
Creation and training of the model
In the following sections, the Keras model for predicting the car price is defined. For this purpose, input_shape = X_train_scaled.shape[1] stores the number of columns in the scaled training data set. The sequential Keras model is then defined with model = Sequential([]). This is a linear stack of layers.
The first layer of the model is the input layer, which uses the previously defined input_shape. The following Dense layers each have 64 neurons and use the activation function activation="relu" (ReLU = Rectified Linear Unit). ReLU is a non-linear activation function that is frequently used in neural networks. The last layer, Dense(1), defines the output layer with one neuron. This layer does not need an activation function because the model predicts a continuous value. A summary of the model is then output with model.summary().
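The architecture described above might be written as follows. This is a sketch under stated assumptions: the zero-filled array only stands in for the real scaled training data, the 39 input columns are inferred from the 2,560 parameters reported for the first Dense layer, and the optimizer and loss are common choices for regression that the text does not name.

```python
import numpy as np
import keras
from keras import layers

# Stand-in for the scaled training features. 39 columns is an inference from
# the parameter count of the first Dense layer: 2,560 = (39 + 1) * 64.
X_train_scaled = np.zeros((8, 39), dtype="float32")
input_shape = X_train_scaled.shape[1]

model = keras.Sequential(
    [
        keras.Input(shape=(input_shape,)),    # input layer
        layers.Dense(64, activation="relu"),  # first hidden layer
        layers.Dense(64, activation="relu"),  # second hidden layer
        layers.Dense(1),                      # output: one continuous value
    ]
)

# Assumed settings; the text does not state which optimizer or loss was used.
model.compile(optimizer="adam", loss="mean_squared_error")

model.summary()
```

The code runs on whichever backend is currently configured, since none of the calls are backend-specific.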
Model: "sequential_1"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense_3 (Dense)                 │ (None, 64)             │         2,560 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_4 (Dense)                 │ (None, 64)             │         4,160 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_5 (Dense)                 │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 6,785 (26.50 KB)
Trainable params: 6,785 (26.50 KB)
Non-trainable params: 0 (0.00 B)
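The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The sketch below redoes this arithmetic; the figure of 39 input features is not stated in the text but follows from 2,560 = (39 + 1) * 64.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Number of trainable parameters of a Dense layer: weights plus biases."""
    return n_inputs * n_units + n_units


n_features = 39            # inferred from the first layer's 2,560 parameters
layer_units = [64, 64, 1]  # units of the three Dense layers in the summary

counts = []
n_in = n_features
for units in layer_units:
    counts.append(dense_params(n_in, units))
    n_in = units  # the output of one layer is the input of the next

total = sum(counts)
size_kb = total * 4 / 1024  # 4 bytes per float32 parameter

print(counts)             # [2560, 4160, 65]
print(total)              # 6785
print(f"{size_kb:.2f}")   # 26.50
```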
To save the trained model, we use model.save(name). This differs in Keras 3 from, for example, Torch, where we had to add all our encoders and scalers separately to the save process. So that we can tell the German and English versions apart, we simply append a 2 to the name: model.save("../models/car_price_prediction_model_2.keras").
Use of the model
In Keras 3, the model is loaded simply with model = keras.models.load_model(name). We also define the same input values that we already used in the other notebooks.
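The round trip of saving and loading can be sketched as follows. The tiny model, the temporary file name and the input values are placeholders chosen for illustration; they are not the car price model or the values from the notebooks.

```python
import os
import tempfile

import numpy as np
import keras
from keras import layers

# Tiny placeholder model; the real notebook would use the trained model.
model = keras.Sequential([keras.Input(shape=(3,)), layers.Dense(1)])

# Hypothetical file name in a temporary directory; the suffix must be .keras.
path = os.path.join(tempfile.mkdtemp(), "demo_model.keras")
model.save(path)

# Restore the model from the single .keras file.
restored = keras.models.load_model(path)

# The restored model should give the same predictions as the original.
x = np.ones((2, 3), dtype="float32")
same = np.allclose(
    model.predict(x, verbose=0), restored.predict(x, verbose=0)
)
print("Predictions match after reload:", same)
```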
Price forecast
The predicted price is slightly lower than that of the other models. However, a check on comparison portals shows that this price is realistic for a car with this engine variant.