Storing complex models

By default, mb.add_model serializes your model with the pickle module from the Python standard library. Not every model can be serialized with pickle, though. In those cases you can choose an alternative serializer.

tip

Keep in mind that alternatives to pickle can come with their own drawbacks and may require extra care when you specify package versions in your environment. When in doubt, stick with the default, pickle.

If you choose to use an alternative serialization library, the model's page in the Model Registry will display which version of the library was used. This way you'll know which version might need to be installed to deserialize the model in the future.
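When reproducing an environment, it can help to check which version of a serialization library is installed locally and compare it with the version shown in the Model Registry. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of Modelbit.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; not part of the Modelbit API.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Compare this value with the version shown on the model's registry page.
print("cloudpickle:", installed_version("cloudpickle"))
```

If the printed version differs from the one recorded in the registry, pinning the recorded version in your environment is the safer choice.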

To use an alternative to pickle, set the serializer= parameter when calling mb.add_model or mb.add_models.

Storing models with cloudpickle

The cloudpickle serialization library is more capable than pickle when it comes to storing objects that reference functions. Here's a simple example of storing a model that contains a lambda function, something pickle does not support:

tricky_model = {"func": lambda x: x * 2}

tricky_model["func"](5) # --> 10

mb.add_model("tricky_model", tricky_model, serializer="cloudpickle")

And then retrieve and use the model as you normally would:

mb.get_model("tricky_model")["func"](5) # --> 10
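To see why the default serializer cannot handle a model like the one above, the failure can be reproduced with the standard library alone. This sketch only exercises pickle directly and does not call Modelbit. The exact exception type and message vary between Python versions, so both likely types are caught.

```python
import pickle

# Same shape as the model above: a dict holding a lambda.
tricky_model = {"func": lambda x: x * 2}

try:
    pickle.dumps(tricky_model)
    pickle_failed = False
except (pickle.PicklingError, AttributeError) as err:
    # pickle stores functions by reference (module plus name),
    # and a lambda has no importable name to refer to.
    pickle_failed = True
    print(f"pickle could not serialize the model: {err}")
```

cloudpickle avoids this by serializing such functions by value, which is why serializer="cloudpickle" works for this model.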

Storing models as files

Some models need to remain as files and cannot (or should not) be pickled. For these kinds of models, use the file= parameter of mb.add_model and mb.get_model.

For example:

# Adding a file-based model to the registry
mb.add_model("my_model", file="large_model_file.bin")

# Getting the model back from the registry
mb.get_model("my_model", file="large_model_file.bin")

For large file-based models, download the file once by calling mb.get_model after your import statements, outside of the inference function, so that the download is not repeated on every inference call.
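The download-once pattern can be sketched as follows. To keep the example runnable without a Modelbit session, fetch_model_file is a hypothetical stand-in for mb.get_model("my_model", file="large_model_file.bin") that only counts how often it is called, and predict is a placeholder inference function.

```python
# Hypothetical stand-in for:
#   mb.get_model("my_model", file="large_model_file.bin")
# It only counts calls so the sketch runs without a Modelbit session.
download_count = 0


def fetch_model_file(name: str, file: str) -> None:
    """Pretend to download `file` for the model `name`."""
    global download_count
    download_count += 1


# Module level, after the imports: executed once when this code is loaded.
fetch_model_file("my_model", file="large_model_file.bin")


def predict(x: float) -> float:
    # The inference function assumes the file is already on disk,
    # so it performs no download of its own.
    return x * 2  # placeholder for real inference against the file


results = [predict(v) for v in (1.0, 2.0, 3.0)]
print(f"downloads: {download_count}, results: {results}")
```

Had the fetch been placed inside predict, the three calls above would have triggered three downloads instead of one.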