Using Neptune with training jobs

Modelbit integrates with Neptune using your Neptune API token so you can log training metadata and model performance to your Neptune projects.

To add your Neptune API token to Modelbit, go to the Integrations tab of Settings, click the Neptune tile, and add your NEPTUNE_API_TOKEN. This token will be available in your training jobs' environments as an environment variable so you can automatically authenticate with Neptune.
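Inside a training job you can confirm the token was injected before initializing Neptune. A minimal sketch (the printed message is just illustrative):

```python
import os

# In a Modelbit training job, NEPTUNE_API_TOKEN is injected as an
# environment variable, so neptune.init_run() can authenticate without
# an explicit api_token argument.
token = os.environ.get("NEPTUNE_API_TOKEN")
if token is None:
    print("NEPTUNE_API_TOKEN not set -- add it in Settings > Integrations")
```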

Creating a training job that uses Neptune

We'll make a training job to train a model to predict flower types, using the scikit-learn Iris dataset. We'll log the model's hyperparameters and accuracy to Neptune and then deploy the model to a REST endpoint.

Our model is very simple and relies on two features to predict the flower type.

Setup

First, import modelbit and neptune and authenticate your notebook with Modelbit:

import modelbit, neptune

mb = modelbit.login()

If your NEPTUNE_API_TOKEN isn't already in your notebook's environment, add it:

import os

os.environ["NEPTUNE_API_TOKEN"] = mb.get_secret("NEPTUNE_API_TOKEN")

Creating the training job

We'll create a function to encapsulate our training logic. At the top of the function we call run = neptune.init_run(...) to start a run and record our hyperparameters with run[...] =. Be sure to change the project= parameter in neptune.init_run to your own Neptune workspace and project.

Then we create and fit the model, logging the model's accuracy to Neptune and saving the model with mb.add_model.

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics
import random

def train_flower_classifier():
    # Pick our hyperparameters
    random_state = random.randint(1, 10_000)
    n_estimators = random.randint(2, 10)
    max_depth = random.randint(2, 5)

    # Init Neptune and log hyperparameters to Neptune
    run = neptune.init_run(project="your-workspace/your-project")
    run["random_state"] = random_state
    run["n_estimators"] = n_estimators
    run["max_depth"] = max_depth

    # Prepare our dataset
    X, y = datasets.load_iris(return_X_y=True, as_frame=True)
    X = X[["sepal length (cm)", "sepal width (cm)"]]  # only use two features
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=random_state)

    model = RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth)
    model.fit(X_train, y_train)

    # Log accuracy to Neptune
    predictions = model.predict(X_test)
    run["accuracy"] = metrics.accuracy_score(y_test, predictions)
    run.stop()  # Stop the Neptune run

    # Save the model to the registry
    mb.add_model("flower_classifier", model)

Deploy and run the training job

We can now deploy our training function to Modelbit with mb.add_job:

mb.add_job(train_flower_classifier, deployment_name="predict_flower")

Click the View in Modelbit button, then click Run Now. Once the job completes, head over to your Neptune project to see that the job logged a new run!

Create a REST endpoint

Finally, we'll deploy our flower predictor model to a REST endpoint. We'll make an inference function that accepts two input features and calls the model we trained, returning the predicted flower type:

flower_names = ["setosa", "versicolor", "virginica"]

def predict_flower(sepal_len: float, sepal_width: float) -> str:
    model = mb.get_model("flower_classifier")
    predicted_class = model.predict([[sepal_len, sepal_width]])[0]
    return flower_names[predicted_class]
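You can sanity-check this inference logic locally before deploying. The sketch below trains a small stand-in classifier in place of the registry model (a hypothetical substitute for mb.get_model("flower_classifier")); everything else mirrors predict_flower:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

flower_names = ["setosa", "versicolor", "virginica"]

# Train a small local stand-in for the registry model (hypothetical --
# the deployed predict_flower fetches "flower_classifier" with mb.get_model)
X, y = load_iris(return_X_y=True, as_frame=True)
X = X[["sepal length (cm)", "sepal width (cm)"]]
local_model = RandomForestClassifier(n_estimators=10, max_depth=4, random_state=42)
local_model.fit(X, y)

def predict_flower_local(sepal_len: float, sepal_width: float) -> str:
    # Same logic as predict_flower, against the local model
    predicted_class = local_model.predict([[sepal_len, sepal_width]])[0]
    return flower_names[predicted_class]

print(predict_flower_local(5.1, 3.5))
```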

Deploy the inference function to create a REST endpoint:

mb.deploy(predict_flower)

Our flower-predicting model is live as a REST endpoint, and every time we retrain it the hyperparameters and accuracy are logged to Neptune for careful tracking.
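As a rough sketch of calling the endpoint, the snippet below builds a JSON payload whose list elements map to predict_flower's (sepal_len, sepal_width) arguments. The URL shape is an assumption based on Modelbit's usual workspace endpoints; substitute your own workspace subdomain:

```python
import json
import urllib.request

# Hypothetical endpoint URL -- replace <your-workspace> with your
# Modelbit workspace subdomain
url = "https://<your-workspace>.app.modelbit.com/v1/predict_flower/latest"

# A single inference request body
payload = {"data": [5.1, 3.5]}
body = json.dumps(payload).encode()
print(body.decode())

# Uncomment to send the request once the deployment is live:
# req = urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
```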