Creating jobs from a notebook

Within a Python notebook, define a function that trains a model and stores it in the model registry. Then create the job in Modelbit with mb.add_job(...):

from sklearn import linear_model

def train():
    lm = linear_model.LinearRegression()
    lm.fit([[1], [2], [3]], [2, 4, 6])
    mb.add_model("example_model", lm)

mb.add_job(train, deployment_name="training_example")

This creates a new deployment called training_example with a job called train inside it. Click View in Modelbit to see the job. In the Source Code tab you'll also see the code that defines this job, copied from your notebook.

Return to the Jobs tab for train and click Run Now to run the job. Once the job finishes, you can fetch its results in your notebook with mb.get_model:

mb.get_model("example_model")

If you turned the Save changes on success toggle off before running the job, you can fetch the staged results of your job with mb.get_job_output:

mb.get_job_output(deployment_name="training_example", job_name="train", model_name="example_model")

This call returns the linear regression that was created after running the train job. Read on for how to use the results of training jobs in inference functions, as well as how to schedule and parameterize your jobs.

Training job examples

Now that you've created your first job, let's look at some more advanced use cases. These training jobs take advantage of the model registry.

Retraining a model used in an inference function

In this example, we'll create a deployment that uses a linear regression to double numbers. Then we'll use a job to retrain and redeploy that linear model.

First we'll define a function to train and store our model called train_doubler. Then we'll use the model in an inference function called predict_double:

from sklearn import linear_model
import time

def train_doubler():
    lm = linear_model.LinearRegression()
    # we're using time.time() to mimic dynamic training data in this example
    lm.fit([[1], [2], [time.time()]], [2, 4, time.time() * 2])
    mb.add_model("doubler_model", lm)

train_doubler()

def predict_double(number: int) -> int:
    doubler_model = mb.get_model("doubler_model")
    return doubler_model.predict([[number]])[0]

predict_double(5)

Then we deploy the inference function predict_double:

mb.deploy(predict_double)

At this point Modelbit knows about the inference function and has created API endpoints for it, but Modelbit doesn't have a job to retrain the model.
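Once the deployment exists, you can call its REST endpoint over HTTP. The sketch below assumes a typical Modelbit endpoint URL shape and batch request format; copy the exact URL and format from the API Endpoints tab of your deployment:

import requests

# placeholder URL; the real one is shown on the deployment's API Endpoints tab
response = requests.post(
    "https://<your-workspace>.app.modelbit.com/v1/predict_double/latest",
    json={"data": [[1, 5]]})  # assumed batch format: [request_id, input] pairs
print(response.json())  # expected shape: {"data": [[1, 10.0]]}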

We'll add our training job train_doubler to our deployment. Whenever the job runs, it'll update doubler_model in the registry, which will be used by predict_double for inferences.

mb.add_job(train_doubler, deployment_name="predict_double")

Click Run Now on the job's page in Modelbit to retrain and update doubler_model.

Retraining with refreshed data on a schedule

Like the example above, we'll create a training function and a deployment function. Adding to the above, we'll call mb.get_dataset(...) in the training function to fetch a dataset from Modelbit to train on. When we send the job to Modelbit, we'll configure it to run every night and to refresh the dataset before each run.

First, the training function train_lead_scorer and deployment function score_lead:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression

def train_lead_scorer():
    training_data = mb.get_dataset("leads")
    X = training_data.drop('CONVERTED', axis=1)
    y = training_data['CONVERTED']
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
    ...  # feature-engineering steps that define `pipeline` are elided here
    model = make_pipeline(pipeline, LogisticRegression(max_iter=1000, random_state=42))
    model.fit(X_train, y_train)
    mb.add_model("lead_scoring_model", model)


def score_lead(hdyhau: str, utm_source: str, industry: str) -> float:
    df = pd.DataFrame.from_records([{
        "HDYHAU": hdyhau,
        "UTM_SOURCE": utm_source,
        "INDUSTRY": industry
    }])
    lead_scoring_model = mb.get_model("lead_scoring_model")
    return lead_scoring_model.predict_proba(df)[0][1]

score_lead("email", "google", "Entertainment")

Then we'll deploy score_lead:

mb.deploy(score_lead)

Lastly, we create the training job, which will run automatically every night:

mb.add_job(train_lead_scorer,
           deployment_name="score_lead",  # to add the training job to the `score_lead` deployment
           schedule="daily",              # to run the training job every night
           refresh_datasets=["leads"])    # to refresh the `leads` dataset before executing the training job

Running jobs with arguments

Training jobs can accept arguments, which can be useful for testing different training parameters. This example changes the max_iter parameter of a LogisticRegression using an argument to the training job:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

def train_iris_lr(max_training_iter: int):
    X, y = load_iris(return_X_y=True)
    clf = LogisticRegression(max_iter=max_training_iter).fit(X, y)
    mb.add_model("iris_model", clf)

mb.add_job(train_iris_lr,
           deployment_name="training_example",  # adding a second job to the first example
           default_arguments=[500])             # default arguments can be overridden later

Above, default_arguments lets us send 500 as the max_training_iter argument to train_iris_lr. If the training function accepted multiple arguments, we'd send multiple values in the default_arguments list.

The default_arguments parameter must be a list of numbers or strings, with one item in the list for each parameter of the training function, as in the sketch below.
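For instance, a training function with two parameters would take a two-item list. A minimal sketch; the function name and values here are illustrative, not part of the example above:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# hypothetical training function with two parameters
def train_iris_lr2(max_training_iter: int, penalty: str):
    X, y = load_iris(return_X_y=True)
    clf = LogisticRegression(max_iter=max_training_iter, penalty=penalty).fit(X, y)
    mb.add_model("iris_model", clf)

mb.add_job(train_iris_lr2,
           deployment_name="training_example",
           default_arguments=[500, "l2"])  # one list item per function parameter, in order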

Running a job and waiting for its results

In addition to creating jobs from the notebook, you can also run jobs and fetch their results. We'll run the train_iris_lr job from the previous example:

job_run_request = mb.run_job(deployment_name="training_example", job_name="train_iris_lr")
job_run_request.wait() # will block until job completes

# then fetch the updated LogisticRegression
mb.get_model("iris_model")

The job above ran with its default argument of 500. We can also run it with different arguments, in this case 1000:

job_run_request = mb.run_job(deployment_name="training_example", job_name="train_iris_lr", arguments=[1000])
job_run_request.wait()

# then fetch the updated LogisticRegression
mb.get_model("iris_model")

Getting outputs from specific job runs

The previous examples using mb.get_model always fetched the most recently saved model in the registry. To fetch results from previous or pending jobs, use mb.get_job_output with the run_id parameter.

In Modelbit: You can see the Run ID on the Training Jobs tab of your deployment, next to each recent job run.

In your notebook: Alternatively, if you're using mb.run_job, you can get the run_id from the returned object: job_run_request.run_id

Once you have your run_id, add it to the mb.get_job_output call:

mb.get_job_output(deployment_name="training_example", job_name="train_iris_lr", run_id=2, model_name="iris_model")

Retrieving files generated by training jobs

Advanced training jobs may write files while they run. The files created by training jobs can be pickled models, logs, or anything else relevant to the job. To retrieve the contents of a file into your notebook, use the file_name parameter of mb.get_job_output.

For example, to fetch the contents of a learn_error.tsv file, run:

mb.get_job_output(
    deployment_name="my_deployment",
    job_name="my_training_job",
    run_id=7,
    file_name="learn_error.tsv")

Files ending in .pkl will be unpickled. Other files will be returned as text or binary strings, depending on their format.
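For example, a pickled model written by the job comes back as the unpickled object (model.pkl here is a hypothetical file name):

# `model.pkl` is a hypothetical file the job wrote; .pkl files are unpickled on fetch
model = mb.get_job_output(
    deployment_name="my_deployment",
    job_name="my_training_job",
    run_id=7,
    file_name="model.pkl")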

Additional parameters for mb.add_job

There are several parameters you can use to customize the behavior of your job.

save_on_success=True

By default, training jobs created in the notebook will update and redeploy your inference function with the retrained model if the job completes successfully. To disable this behavior, set save_on_success=False:

mb.add_job(my_job, deployment_name="training_example", save_on_success=False)
Tip: You can conditionally choose to prevent redeployment. To do so, exit with a non-zero exit code using sys.exit(1) or raise an exception. In either case, Modelbit will consider the job a failure and skip redeploying the new model, even if save_on_success is set to True.
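For example, a training function might validate the new model before letting the job succeed. A minimal sketch; the score check and threshold are illustrative:

import sys
from sklearn import linear_model

def train_with_validation():
    lm = linear_model.LinearRegression()
    lm.fit([[1], [2], [3]], [2, 4, 6])
    # illustrative guard: fail the job (and skip redeployment) if the fit looks poor
    if lm.score([[1], [2], [3]], [2, 4, 6]) < 0.99:
        sys.exit(1)  # non-zero exit marks the job as failed
    mb.add_model("example_model", lm)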

store_result_as="my_model"

Training functions can optionally return values. By default, the output of the training job is stored under the same name as the training job. To store the output under a different name, use store_result_as:

mb.add_job(my_job, deployment_name="training_example", store_result_as="my_model")

schedule="cron-string"

Modelbit jobs can run on any schedule you can define with a cron string. You can also use the simpler schedules hourly, daily, weekly, and monthly:

mb.add_job(my_job, deployment_name="training_example", schedule="daily")
# or
mb.add_job(my_job, deployment_name="training_example", schedule="0 0 * * *")

refresh_datasets=["dataset-name"]

Jobs usually require fresh data to retrain their models. Using the refresh_datasets parameter tells Modelbit to refresh the datasets used by the job before executing the job:

mb.add_job(my_job, deployment_name="training_example", refresh_datasets=["leads"])

size="size"

If your job requires more CPU or RAM than the default job runner provides, use a larger runner. Set the size parameter to one of the sizes in the runner sizes table:

mb.add_job(my_job, deployment_name="training_example", size="medium")

email_on_failure="your-email"

Modelbit can email you if your job fails. Just set the email_on_failure parameter to your email address:

mb.add_job(my_job, deployment_name="training_example", email_on_failure="you@company.com")