Creating jobs with git
Jobs are defined in a `jobs.yaml` file, next to the `metadata.yaml` file inside each deployment's directory. Each job must have a name and a command to run. In this example, the name of the job is `train`, and when executed it runs `python train.py` in the deployment's directory:
```yaml
jobs:
  train:
    command: python train.py
schemaVersion: 1
```
For ease of development, jobs are meant to have the same behavior locally as in the Modelbit environment. In the example above, running `python train.py` locally should write out a new version of your model (e.g. to `data/model.pkl`). That `.pkl` would be the same file loaded in `source.py` for performing inferences.
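That write/load round trip can be sketched with plain `pickle`; the dict here is a hypothetical stand-in for a real trained model object:

```python
import os
import pickle

# Minimal sketch of the write/load round trip; the dict stands in for a
# real trained model such as a scikit-learn estimator.
os.makedirs("data", exist_ok=True)

# What the training job writes out...
with open("data/model.pkl", "wb") as f:
    pickle.dump({"coef": 2.0}, f)

# ...is exactly the artifact the inference code later loads.
with open("data/model.pkl", "rb") as f:
    model = pickle.load(f)
```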
Customizing job behavior
Jobs can be customized to deploy new versions of the model, run on a schedule, and refresh the datasets they depend on.
Redeploying on success
If your training job writes new files (e.g. an updated `data/model.pkl` file) and exits successfully, you can tell Modelbit to use those updated files to create a new version of your deployment. Set `pushBranch: true` within the `onSuccess` key:
```yaml
jobs:
  train:
    command: python train.py
    onSuccess:
      pushBranch: true
schemaVersion: 1
```
If no files are changed, or if the code errors or exits with a non-zero code, the `pushBranch: true` will be ignored.

You can conditionally choose not to redeploy the model by (1) throwing an exception, (2) calling `sys.exit()` with a non-zero exit code, or (3) not writing out the new model pickle file.
Setting a schedule
To run your job on a recurring schedule, use a cron-style string with the `schedule` key. The following example runs `python train.py` every day at UTC midnight:
```yaml
jobs:
  train:
    command: python train.py
    schedule: 0 0 * * *
schemaVersion: 1
```
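The five space-separated fields of a cron string are minute, hour, day of month, month, and day of week. A quick way to read one (a sketch, not a validator):

```python
# Split a cron string into its five named fields.
schedule = "0 0 * * *"
fields = dict(zip(
    ["minute", "hour", "day_of_month", "month", "day_of_week"],
    schedule.split(),
))
# minute 0 of hour 0, every day of every month: daily at UTC midnight.
```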
Refreshing datasets
Jobs usually require fresh data to retrain their models. Using the `refreshDatasets` key inside `beforeStart` tells Modelbit to refresh the datasets used by the job before executing it:
```yaml
jobs:
  train:
    beforeStart:
      refreshDatasets:
        - dataset1
        - dataset2
    command: python train.py
schemaVersion: 1
```
If any dataset errors while refreshing (for example, if a table is missing), the job will be marked as failed.
Runner size
If your job requires more CPU or RAM than the default job runner provides, use a larger runner. Set the `size` parameter to one of the sizes from the runner sizes table:
```yaml
jobs:
  train:
    command: python train.py
    size: medium
schemaVersion: 1
```
Email alerts if jobs fail
Modelbit can email you if your job fails. Just set `sendEmail` within the `onFailure` key to your email address:
```yaml
jobs:
  train:
    command: python train.py
    onFailure:
      sendEmail: you@company.com
schemaVersion: 1
```
Example job and inference files sharing a model
The following example shows a training job that creates and saves a model that doubles numbers. That model is then used in a deployment's inference function.
`train.py` defines the job's Python code:
```python
from sklearn.linear_model import LinearRegression
import pickle, time

if __name__ == "__main__":
    model = LinearRegression()
    model.fit([[1], [2], [3], [time.time()]], [2, 4, 6, time.time() * 2])
    with open("data/model.pkl", "wb") as f:
        pickle.dump(model, f)
```
`jobs.yaml` defines the training job that will create a new version of the deployment after running:
```yaml
jobs:
  train:
    command: python train.py
    onSuccess:
      pushBranch: true
schemaVersion: 1
```
`source.py` defines the code of the inference function that uses our trained model:
```python
from sklearn.linear_model import LinearRegression
import pickle

with open("data/model.pkl", "rb") as f:
    model = pickle.load(f)

def doubler(a: int) -> float:
    return model.predict([[a]])[0]

# to test locally
if __name__ == "__main__":
    print(doubler(21))
```
`metadata.yaml` defines how to call our `doubler` function for inferences:
```yaml
owner: you@company.com
runtimeInfo:
  mainFunction: doubler
  mainFunctionArgs:
    - a:int
    - return:float
  pythonVersion: "3.8"
schemaVersion: 2
```
Finally, a `requirements.txt` defines the environment's dependencies:

```
scikit-learn==1.0.2
```
Run `train.py` locally to create the first version of `model.pkl`. Then `git push`. Checking these files in via git will create a deployment with a training job that retrains and redeploys the model.
Full schema for jobs.yaml
A `jobs.yaml` that refreshes two datasets, redeploys to the current branch on success, and runs on a daily schedule looks like the following:
```yaml
jobs:
  train:
    beforeStart:
      refreshDatasets:
        - dataset1
        - dataset2
    command: python train.py
    onFailure:
      sendEmail: you@company.com
    onSuccess:
      pushBranch: true
    size: small
    schedule: 0 0 * * *
schemaVersion: 1
```