Getting started in your notebook
The model registry stores models for later retrieval. Like the rest of Modelbit, the registry is git-aware so it respects branches and can be reviewed during code reviews.
Storing your first model
To store a model in the registry, call mb.add_model:
# first, make a model
from sklearn import linear_model
model = linear_model.LinearRegression()
model.fit([[1], [2], [3]], [2, 4, 6])
# store the model, we'll call it "example_model"
mb.add_model("example_model", model)
The linear regression named model has been stored in the registry as example_model.
To fetch the model from the registry, call mb.get_model:
# retrieve the model
my_model = mb.get_model("example_model")
# test that it works
my_model.predict([[5]])[0] # --> 9.999...
Storing many models
If you have several (or several thousand) models to store in the registry, you'll want them organized. The model registry is organized like a file system, where models belong to directories.
To add several models in different directories, use mb.add_models:
mb.add_models({
    "marketing/predictor1": model1,
    "marketing/predictor2": model2,
    "finance/fraud_scorer": model3,
    "finance/latency_scorer": model4,
})
The model registry in the web app will show these models grouped into marketing and finance directories.
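To make the directory convention concrete, here is a minimal sketch in plain Python of how slash-separated names split into a directory and a model name. It does not call Modelbit: group_by_directory is a hypothetical helper written only for illustration, and the registry's actual grouping logic may differ.

```python
from collections import defaultdict


def group_by_directory(names):
    """Group slash-separated model names by their directory prefix.

    Hypothetical helper for illustration; not part of the Modelbit API.
    """
    groups = defaultdict(list)
    for name in names:
        # Split on the last "/" so nested directories stay intact.
        directory, _, model_name = name.rpartition("/")
        groups[directory].append(model_name)
    return dict(groups)


names = [
    "marketing/predictor1",
    "marketing/predictor2",
    "finance/fraud_scorer",
    "finance/latency_scorer",
]
grouped = group_by_directory(names)
print(grouped)
```

In this sketch, a name without a slash, such as example_model, lands under the empty-string key, which stands in for the top level.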
Like before, retrieving a model is done by name:
my_model = mb.get_model("marketing/predictor2")
Now that you can store and retrieve models, let's use the model registry with a deployment.
Storing complex models
By default, passing your model to add_model will serialize your model using the Python standard library's pickle module. But not all models can be serialized using pickle. In these cases you can choose to use an alternative serializer.
Keep in mind that alternatives to pickle come with their own drawbacks, and may require you to take extra care when specifying the versions of packages in your environment. When in doubt, stick with the default of pickle.
If you choose to use an alternative serialization library, the model's page in the Model Registry will display which version of the library was used. This way you'll know which version might need to be installed to deserialize the model in the future.
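When you later need to deserialize such a model, one way to check what is installed locally is the standard library's importlib.metadata. The snippet below is a rough sketch, not a Modelbit feature: installed_version is a hypothetical helper, and cloudpickle is used only as an example package name.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing.

    Hypothetical helper for illustration; not part of the Modelbit API.
    """
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Compare this value against the version shown on the model's registry page.
print(installed_version("cloudpickle"))
```

If the printed version differs from the one recorded in the registry, pinning the recorded version in your environment is the cautious choice.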
To use an alternative to pickle, set the serializer= parameter when calling add_model or add_models.
Storing models with cloudpickle
The cloudpickle serialization library is more capable than pickle when it comes to storing objects that reference functions. Here's a simple example of storing a model that contains a lambda function, something pickle does not support:
tricky_model = {"func": lambda x: x * 2}
tricky_model["func"](5) # --> 10
mb.add_model("tricky_model", tricky_model, serializer="cloudpickle")
And then retrieve and use the model as you normally would:
mb.get_model("tricky_model")["func"](5) # --> 10
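To see why the default serializer is not enough for this model, you can try the standard pickle module on it directly. The following is a small sketch using only the standard library, with no Modelbit calls: can_pickle is a hypothetical helper, and plain_model is a stand-in dictionary rather than a real trained model.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj.

    Hypothetical helper for illustration; not part of the Modelbit API.
    """
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # pickle stores functions by importable name, and a lambda has none.
        return False


plain_model = {"coef": [2.0], "intercept": 0.0}
tricky_model = {"func": lambda x: x * 2}

print(can_pickle(plain_model))   # True
print(can_pickle(tricky_model))  # False
```

A check like this can help you decide whether a model needs serializer="cloudpickle" before you add it to the registry.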