
Creating jobs from a notebook

Within a notebook, use a function to encapsulate the code you need to train your model. Then decorate that function with @modelbit.job.

Later, when you use mb.deploy(...), Modelbit will recognize that your model came from a function decorated with @modelbit.job and automatically create a job to retrain it.

Here's an example where we train a linear regression called model with the function train(), and then use model in our deployment function doubler(...).

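import time

import modelbit
from sklearn.linear_model import LinearRegression

# Authenticate once per notebook; skip this if mb already exists.
mb = modelbit.login()

@modelbit.job
def train():
    # Every target is twice its input; the time.time() sample gives
    # each run fresh training data.
    lm = LinearRegression()
    lm.fit([[1], [2], [3], [time.time()]], [2, 4, 6, time.time() * 2])
    return lm

model = train()

def doubler(a: int) -> float:
    return model.predict([[a]])[0]

mb.deploy(doubler)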
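
To see what the model in this example learns, here is a minimal sketch that fits the same kind of data with plain Python instead of scikit-learn. The helper fit_line and the stand-in value t are illustrative assumptions, not part of Modelbit or of the example above. Because every target is exactly twice its input, the fitted slope is 2 and the intercept is 0, which is why doubler(...) returns roughly twice its argument.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance terms (unnormalized; the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


t = 10.0  # hypothetical stand-in for time.time()
slope, intercept = fit_line([1, 2, 3, t], [2, 4, 6, t * 2])
print(slope, intercept)
```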

After deploying, you'll see that a job called train has been automatically created in the Modelbit web app. You can run the job by clicking the Run Now button, or have it run automatically by editing the job and adding a schedule.

Optional parameters for @modelbit.job

There are several parameters you can use to customize the behavior of your job.

redeploy_on_success=True

By default, training jobs created in the notebook will redeploy your deployment with the retrained model after they complete successfully. To disable this behavior, set redeploy_on_success=False:

@modelbit.job(redeploy_on_success=False)
def train():
    ...

Tip

You can also prevent redeployment conditionally. Exit with a non-zero exit code using sys.exit(1), or raise an exception. In either case, Modelbit considers the job a failure and skips redeploying the new model, even if redeploy_on_success is set to True.
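
One way to apply this is a quality gate that raises when a retrained model scores too low. The sketch below is only an illustration: check_model_quality, MIN_SCORE, and the validation variables named in the comment are hypothetical and not part of Modelbit's API.

```python
# Hypothetical quality bar, chosen for illustration only.
MIN_SCORE = 0.9


def check_model_quality(score: float, min_score: float = MIN_SCORE) -> None:
    """Raise ValueError when the retrained model scores below the bar."""
    if score < min_score:
        raise ValueError(
            f"score {score:.3f} is below the required {min_score:.3f}"
        )


# Inside a function decorated with @modelbit.job, you might call this
# just before returning the model (X_val and y_val are assumed names):
#
#     check_model_quality(lm.score(X_val, y_val))
#     return lm
```

Raising an exception here marks the run as failed, so the previously deployed model stays in place.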

schedule="cron-string"

Modelbit jobs can be run on any schedule you can define with a cron string. You can also use the simpler schedules of hourly, daily, weekly and monthly:

@modelbit.job(schedule="daily")
def train1():
    ...

@modelbit.job(schedule="0 0 * * *")
def train2():
    ...
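
If cron strings are unfamiliar, the following sketch labels the five fields of a standard cron expression. The helper describe_cron is hypothetical and not part of Modelbit; it only splits the string and does not validate field values.

```python
# Field order in a standard five-field cron expression.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expr: str) -> dict:
    """Map each field of a five-field cron expression to its name."""
    parts = expr.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expr!r}")
    return dict(zip(CRON_FIELDS, parts))


fields = describe_cron("0 0 * * *")
print(fields)
```

Read this way, "0 0 * * *" fires at minute 0 of hour 0 on every day of every month, that is, once a day at midnight.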

refresh_datasets=["dataset-name"]

Jobs usually require fresh data to retrain their models. Using the refresh_datasets parameter tells Modelbit to refresh the datasets used by the job before executing the job:

@modelbit.job(refresh_datasets=["leads"])
def train():
    df = mb.get_dataset("leads")
    ...

size="size"

If your job requires more CPU or RAM than the default job runner provides, use a larger runner. Set the size parameter to one of the sizes from the runner sizes table:

@modelbit.job(size="medium")
def train():
    ...

email_on_failure="your-email"

Modelbit can email you if your job fails. Just set the email_on_failure parameter to your email address:

@modelbit.job(email_on_failure="you@company.com")
def train():
    ...

Using multiple parameters

You can pass multiple parameters to @modelbit.job:

@modelbit.job(redeploy_on_success=False, refresh_datasets=["leads"])
def train():
    df = mb.get_dataset("leads")
    ...