Getting started in your notebook

The model registry stores models for later retrieval. Like the rest of Modelbit, the registry is git-aware: it respects branches, and changes to models show up in code review.

Storing your first model

To store a model in the registry, call mb.add_model:

# first, make a model
from sklearn import linear_model
model = linear_model.LinearRegression().fit([[1], [2], [3]], [2, 4, 6])

# store the model, we'll call it "example_model"
mb.add_model("example_model", model)

The linear regression model has been stored in the registry as example_model.

To fetch the model from the registry, call mb.get_model:

# retrieve the model
my_model = mb.get_model("example_model")

# test that it works
my_model.predict([[5]])[0] # --> 9.999...

Storing many models

If you have several (or thousands of) models to store in the registry, you'll want them organized. The model registry is organized like a file system, where models belong to directories.

To add several models across different directories, use mb.add_models:

mb.add_models({
    "marketing/predictor1": model1,
    "marketing/predictor2": model2,
    "finance/fraud_scorer": model3,
    "finance/latency_scorer": model4,
})

The model registry in the web app will show these models grouped into marketing and finance directories.

Like before, retrieving a model is done by name:

my_model = mb.get_model("marketing/predictor2")

Now that you can store and retrieve models, let's use the model registry with a deployment.

Storing complex models

By default, passing your model to add_model will serialize your model using the Python standard library's pickle module. But not all models can be serialized using pickle. In these cases you can choose to use an alternative serializer.
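Under the hood, the default serializer behaves like a plain pickle round trip. Here's a rough sketch of that idea (the dict below is a stand-in for a real model object, not Modelbit's actual storage code):

```python
import pickle

# A stand-in "model": any picklable Python object works
model = {"coef": 2.0, "intercept": 1.0}

# Serialize to bytes, as the default pickle serializer would
blob = pickle.dumps(model)

# Deserialize later, recovering an equivalent object
restored = pickle.loads(blob)

restored["coef"] * 5 + restored["intercept"]  # --> 11.0
```

Any object that survives this round trip can be stored with the default serializer.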


Keep in mind that using alternatives to pickle can come with their own drawbacks, and may require you to take extra care when specifying the versions of packages in your environment. When in doubt, stick with the default of pickle.

If you choose to use an alternative serialization library, the model's page in the Model Registry will display which version of the library was used. This way you'll know which version might need to be installed to deserialize the model in the future.

To use an alternative to pickle, set the serializer= parameter when calling add_model or add_models.

Storing models with cloudpickle

The cloudpickle serialization library is more capable than pickle when it comes to storing objects that reference functions. Here's a simple example of storing a model that contains a lambda function, something pickle does not support:

tricky_model = { "func": lambda x : x * 2 }

tricky_model["func"](5) # --> 10

mb.add_model("tricky_model", tricky_model, serializer="cloudpickle")

And then retrieve and use the model as you normally would:

mb.get_model("tricky_model")["func"](5) # --> 10
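To see why the serializer override matters here, you can check that the standard library's pickle refuses this same object (a quick local illustration, not part of the Modelbit API):

```python
import pickle

tricky_model = {"func": lambda x: x * 2}

try:
    pickle.dumps(tricky_model)
except Exception as err:
    # pickle stores functions by reference and can't look the lambda up,
    # so serialization fails
    print(type(err).__name__)
```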

Storing models as files

Some models need to remain as files and cannot (or should not) be pickled. For these kinds of models, use the file= parameter of add_model and get_model.

For example:

# adding a file-based model to the registry
mb.add_model("my_model", file="large_model_file.bin")

# getting the model back from the registry
mb.get_model("my_model", file="large_model_file.bin")

For large file-based models, call mb.get_model once, just after your import statements and outside your inference function, so the model isn't re-downloaded on every inference.
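The pattern looks like the sketch below. It uses pickle and a local file as a stand-in for the download so it runs anywhere; in a real deployment the module-level load is where you'd call mb.get_model("my_model", file="large_model_file.bin"):

```python
import pickle

# Create a stand-in model file (in a real project this is your large
# binary checkpoint, already stored in the registry)
with open("large_model_file.bin", "wb") as f:
    pickle.dump({"coef": 3.0}, f)

# Load the model ONCE, at import time -- this is where the
# mb.get_model(...) call belongs in a deployment
with open("large_model_file.bin", "rb") as f:
    model = pickle.load(f)

def predict(x):
    # Each inference call reuses the already-loaded model;
    # nothing is downloaded or deserialized per request
    return model["coef"] * x

predict(2)  # --> 6.0
```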