AI as ghostwriter: Different Ways to Save Your Machine Learning Model

Edward Johnson
5 min read · Dec 21, 2022


Saving your trained machine learning models is a vital step of the workflow and enables you to reuse your models in the future.
I wanted to see whether I could write about four ways to save a model in Python, regardless of my own knowledge level, so I utilised AI as my ghostwriter.

Pickle
How to use Pickle to serialize a trained machine learning model and save it to a file

Pickle is a module in Python's standard library that allows you to serialize Python objects and save them to a file. You can then use the pickled file to reconstruct the original object in another Python script or in the same script. This is useful when you want to save a trained machine learning model so that you can use it at a later time without having to retrain it.

Here’s an example of how to use Pickle to serialize a machine learning model and save it to a file:
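A minimal sketch, assuming model is an already-trained, scikit-learn style estimator from your workflow and model.pkl is just an illustrative file name:

import pickle

# Serialize the trained model to a file
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Later, deserialize the model from the file
with open('model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)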

In the example above, model is the trained machine learning model that you want to serialize. The wb mode in the open function specifies that you are opening the file for writing in binary mode, and the rb mode specifies that you are opening the file for reading in binary mode.

You can also use the pickle.dump function to write the pickled object to a file-like object instead of a file on disk. Because pickled data is binary, this should be an io.BytesIO buffer rather than a StringIO object. This can be useful if you want to save the model to a database or send it over a network connection.
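A sketch of that idea, using an in-memory io.BytesIO buffer (or pickle.dumps, which returns the raw bytes directly):

import io
import pickle

# Serialize the model into an in-memory buffer instead of a file on disk
buffer = io.BytesIO()
pickle.dump(model, buffer)
model_bytes = buffer.getvalue()   # raw bytes, e.g. for a database blob

# Equivalent shortcut: dumps/loads work directly with bytes
model_bytes = pickle.dumps(model)
loaded_model = pickle.loads(model_bytes)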

Note that the Pickle format is not guaranteed to be compatible between Python versions. If you need to serialize a model and save it to a file so that it can be used in a different Python version, you might want to consider using a more portable serialization format, such as JSON, YAML, or HDF5.

Joblib
How to use joblib to serialize my trained machine learning model and save it to a file

Joblib is a Python library that provides functions for serializing Python objects and saving them to a file. It is similar to Pickle, but is generally faster and more efficient for objects that carry large NumPy arrays, such as fitted scikit-learn models.

Here’s an example of how to use Joblib to serialize a machine learning model and save it to a file:
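Again a minimal sketch, assuming model is the trained estimator and model.joblib is an illustrative file name:

import joblib

# Serialize the trained model to a file with the .joblib extension
joblib.dump(model, 'model.joblib')

# Later, load the model back from the file
model = joblib.load('model.joblib')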

In the example above, model is the trained machine learning model that you want to serialize. The dump function saves the model to a file with the .joblib extension, and the load function loads the model from the file.

You can also use the dump function to write the serialized object to a file-like object, such as an io.BytesIO buffer, instead of a file on disk. This can be useful if you want to save the model to a database or send it over a network connection.
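A short sketch of that, with an in-memory io.BytesIO buffer standing in for the file:

import io
import joblib

# Serialize the model into an in-memory buffer
buffer = io.BytesIO()
joblib.dump(model, buffer)
model_bytes = buffer.getvalue()

# Restore the model from the buffer
buffer.seek(0)
loaded_model = joblib.load(buffer)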

Note that the Joblib format is not guaranteed to be compatible between Python versions. If you need to serialize a model and save it to a file so that it can be used in a different Python version, you might want to consider using a more portable serialization format, such as JSON, YAML, or HDF5.

JSON
How to use JSON method to save all of the required parameters to build a fitted model.

The json module in Python provides functions for serializing and deserializing JSON (JavaScript Object Notation) data. You can use this module to save the parameters required to build a fitted machine learning model as a JSON object.

Here’s an example of how to use the json module to save the parameters of a fitted machine learning model:
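A minimal sketch, assuming a scikit-learn style estimator whose hyperparameters are all JSON-serializable (for example a RandomForestClassifier), and the illustrative file name model_params.json:

import json

# Collect what is needed to rebuild the model
model_params = {
    'model_type': type(model).__name__,   # e.g. 'RandomForestClassifier'
    'params': model.get_params(),         # e.g. {'n_estimators': 100, 'max_depth': 10}
}

# Save the dictionary as a JSON object
with open('model_params.json', 'w') as f:
    json.dump(model_params, f)

# Later, load the JSON object back into the model_params variable
with open('model_params.json', 'r') as f:
    model_params = json.load(f)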

In the example above, model is the fitted machine learning model, and model_params is a dictionary that contains the type of the model (e.g. 'RandomForestClassifier') and the parameters required to build the model (e.g. {'n_estimators': 100, 'max_depth': 10}).

The json.dump function saves the model_params dictionary as a JSON object to the model_params.json file, and the json.load function loads the JSON object from the file into the model_params variable.

You can then use the model_type value to look up the right model class and the ** operator to expand the saved parameter dictionary into keyword arguments for its constructor. Keep in mind that this only recreates the model's configuration, not its learned state, so the rebuilt model still needs to be fitted.
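A sketch of that rebuilding step, assuming the saved model was a scikit-learn RandomForestClassifier:

from sklearn.ensemble import RandomForestClassifier

# Map the saved class name back to a constructor (list only the classes you expect)
model_classes = {'RandomForestClassifier': RandomForestClassifier}

model_type = model_params['model_type']
model = model_classes[model_type](**model_params['params'])

# Same hyperparameters, but no learned state: fit on your training data before use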

Note that the JSON format is generally portable and can be used to save the parameters of a machine learning model in a way that can be easily read and understood by humans. However, it does not support all types of Python objects, and you might need to use a different serialization format if you want to save more complex objects.

TensorFlow Keras
How to use TensorFlow Keras to save a TensorFlow model to a SavedModel or HDF5 file.

TensorFlow is a popular open-source machine learning framework, and its tf.keras module provides a high-level interface for building, training, and deploying machine learning models. It also provides functions for saving trained models as TensorFlow SavedModels or HDF5 files.

Here’s an example of how to use tf.keras to save a TensorFlow model as a TensorFlow SavedModel:
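A minimal sketch with a small illustrative architecture; in practice model would be your own compiled and trained network:

import tensorflow as tf

# A tiny illustrative model; substitute your own trained model here
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')

# Save as a TensorFlow SavedModel: a directory holding the model and its weights
model.save('my_model', save_format='tf')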

In the example above, model is a TensorFlow Sequential model that has been compiled and trained on some data. The model.save function saves the model as a TensorFlow SavedModel, which is a directory containing a serialized version of the model and the model's trained weights. The save_format argument specifies that the model should be saved as a TensorFlow SavedModel.

You can also use the model.save function to save the model as an HDF5 file, which is a binary file that stores data in a hierarchical structure:
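For example, assuming the same model object and an illustrative file name:

# Save the same model as a single HDF5 file instead
model.save('my_model.h5', save_format='h5')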

In the example above, the save_format argument specifies that the model should be saved as an HDF5 file.

You can later load the saved model using the tf.keras.models.load_model function:
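A short sketch, reusing the illustrative paths from above:

import tensorflow as tf

# Works for both a SavedModel directory and an HDF5 file
model = tf.keras.models.load_model('my_model')      # SavedModel directory
model = tf.keras.models.load_model('my_model.h5')   # HDF5 file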

The tf.keras.models.load_model function can automatically detect whether the model was saved as a TensorFlow SavedModel or an HDF5 file, and it will return the appropriate model object.

AI was used to write this article, with inspiration from a LinkedIn article by Kurtis Pyke. Let me know if this was useful for you.


Edward Johnson

Ikinique Ltd — Passionate about AI augmentation, soft skills, data science, mentorship, fintech, blockchain, Hyperledger, Ethics #IKEAization