Using Neptune with training jobs

Modelbit integrates with Neptune using your Neptune API token so you can log training metadata and model performance to your Neptune projects.

To add your Neptune API token to Modelbit, go to the Integrations tab of Settings, click the Neptune tile, and add your NEPTUNE_API_TOKEN. This token will be available in your training jobs' environments as an environment variable so you can automatically authenticate with Neptune.
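Because the token arrives as an environment variable, a training job can check for it up front and fail with a clear message instead of a later authentication error. The following is a minimal sketch, not part of Modelbit or Neptune: the helper name `require_neptune_token` is hypothetical, and only the variable name `NEPTUNE_API_TOKEN` comes from the setup above.

```python
import os


def require_neptune_token(env=os.environ):
    """Return the Neptune API token from the environment, or raise a clear error.

    Hypothetical helper for illustration; `env` is any mapping of
    environment variable names to values and defaults to os.environ.
    """
    token = env.get("NEPTUNE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "NEPTUNE_API_TOKEN is not set; add it in the Neptune tile "
            "on the Integrations tab of Settings."
        )
    return token
```

Calling `require_neptune_token()` at the start of a training function would surface a missing token before any training work begins.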

Creating a training job that uses Neptune

We'll create a training job that trains a model to predict flower types using the scikit-learn Iris dataset. We'll log the model's hyperparameters and accuracy to Neptune and then deploy the model to a REST endpoint.

Our model will be simple and rely on just two features, sepal length and sepal width, to predict the flower type.

Creating the training job

First, create a function to encapsulate the training logic. Near the top of the function, call run = neptune.init_run(...) to start a run, then record each hyperparameter by assigning it to a key on the run, as in run["max_depth"] = max_depth. Be sure to change the project= parameter in neptune.init_run to your own workspace and project.

Then create and fit the model, logging the model's accuracy to Neptune and saving the model with mb.add_model.

import modelbit as mb
import neptune
import os
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics
import random

os.environ["NEPTUNE_API_TOKEN"] = mb.get_secret("NEPTUNE_API_TOKEN")

def train_flower_classifier():
    # Pick our hyperparameters
    random_state = random.randint(1, 10_000)
    n_estimators = random.randint(2, 10)
    max_depth = random.randint(2, 5)

    # Init Neptune and log hyperparameters to Neptune
    run = neptune.init_run(project="your-workspace/your-project")
    run["random_state"] = random_state
    run["n_estimators"] = n_estimators
    run["max_depth"] = max_depth

    # Prepare our dataset
    X, y = datasets.load_iris(return_X_y=True, as_frame=True)
    X = X[["sepal length (cm)", "sepal width (cm)"]]  # only use two features
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=random_state)

    # Create and fit the model
    model = RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth)
    model.fit(X_train, y_train)

    # Log accuracy to Neptune
    predictions = model.predict(X_test)
    run["accuracy"] = metrics.accuracy_score(y_test, predictions)
    run.stop()  # Stop Neptune session

    # Save model to the registry
    mb.add_model("flower_classifier", model)
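The hyperparameter step in the training function can also be factored into a small helper, which makes the sampled ranges easy to check in isolation. This is only a sketch: `sample_hyperparameters` is a hypothetical name, and the seeded generator is there purely so the example is repeatable. The ranges match the training function, and `randint` includes both endpoints.

```python
import random


def sample_hyperparameters(rng=random):
    """Draw hyperparameters from the same ranges as the training function.

    Hypothetical helper; `rng` is the random module or a random.Random instance.
    """
    return {
        "random_state": rng.randint(1, 10_000),
        "n_estimators": rng.randint(2, 10),
        "max_depth": rng.randint(2, 5),
    }


# A seeded generator is used here only to make the example repeatable.
params = sample_hyperparameters(random.Random(0))
```

Inside the training function, each entry could then be logged with the same assignment style used above, for example by looping over `params.items()` and setting `run[name] = value`.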

Running the training job

We can now send our training function to Modelbit with mb.add_job or using Git.

Once the job is deployed, click the link to view it in Modelbit, then click Run Now. When the job completes, head over to your Neptune project to confirm that it logged a new run.