# Creating jobs with git

Training jobs are defined in a `jobs.yaml` file, next to the `metadata.yaml` file inside each deployment's directory. Each job must have a name and a command to run. In this example, the name of the job is `train`, and when executed it runs `python train.py` in the deployment's directory:

```yaml
jobs:
  train:
    command: python train.py
schemaVersion: 1
```

For ease of development, jobs are meant to behave the same locally as in the Modelbit environment. In the example above, running `python train.py` locally should write out a new version of your model (e.g. to `data/model.pkl`). That `.pkl` is the same file loaded in `source.py` for performing inferences.

Create your first job using `mb.add_job(...)` in a notebook, then modify the generated files to configure the job via git.
## Customizing job behavior

Jobs can be customized to deploy new versions of the model, run on a schedule, and refresh the datasets they depend on.
### Setting a schedule

To run your job on a recurring schedule, use a cron-style string with the `schedule` key. The following example runs `python train.py` every day at UTC midnight:

```yaml
jobs:
  train:
    command: python train.py
    schedule: 0 0 * * *
schemaVersion: 1
```
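Cron strings use the standard five fields: minute, hour, day of month, month, and day of week. A few illustrative values for the `schedule` key (fragments only, one `schedule` per job):

```yaml
# minute hour day-of-month month day-of-week
schedule: 0 * * * *    # every hour, on the hour
schedule: 30 6 * * 1   # 06:30 UTC every Monday
schedule: 0 0 1 * *    # midnight UTC on the first of each month
```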
### Refreshing datasets

Jobs usually require fresh data to retrain their models. Using the `refreshDatasets` key inside `beforeStart` tells Modelbit to refresh the datasets used by the job before executing it:

```yaml
jobs:
  train:
    beforeStart:
      refreshDatasets:
        - dataset1
        - dataset2
    command: python train.py
schemaVersion: 1
```

If any dataset errors while refreshing (for example, if a source table is missing), the job will be marked as failed.
### Runner size

If your job requires more CPU or RAM than the default job runner provides, use a larger runner. Set the `size` parameter to one of the sizes from the runner sizes table:

```yaml
jobs:
  train:
    command: python train.py
    size: medium
schemaVersion: 1
```
### Passing command line arguments

If your training job requires arguments to change its behavior (e.g. setting a model tuning parameter), add them as a list under the `arguments` key. Arguments must be numbers or strings, and are appended to the `command` as a suffix. The arguments specified in `jobs.yaml` are the defaults, and can be overridden when running a job.

```yaml
jobs:
  train:
    arguments:
      - 42
    command: python train.py
    size: small
schemaVersion: 1
```
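Because the arguments are appended to the command, the job above effectively runs `python train.py 42`, and the values arrive in `train.py` as ordinary command-line strings. A minimal sketch of reading them back (the `parse_args` helper is an illustration, not part of Modelbit):

```python
import sys

def parse_args(argv):
    """Convert command-line strings back to numbers where possible."""
    parsed = []
    for raw in argv:
        try:
            # jobs.yaml arguments are numbers or strings; numbers arrive
            # as strings on the command line, so convert them back.
            parsed.append(float(raw) if "." in raw else int(raw))
        except ValueError:
            parsed.append(raw)
    return parsed

if __name__ == "__main__":
    args = parse_args(sys.argv[1:])  # e.g. [42] when run as: python train.py 42
    print(args)
```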
### Timeouts

To limit the time that your job is allowed to run, set the `timeoutMinutes` parameter to an integer between 5 and 1440 (1 day):

```yaml
jobs:
  train:
    command: python train.py
    size: small
    timeoutMinutes: 10
schemaVersion: 1
```
### Email alerts if jobs fail

Modelbit can email you if your job fails. Just set `sendEmail` within the `onFailure` key to your email address:

```yaml
jobs:
  train:
    command: python train.py
    onFailure:
      sendEmail: you@company.com
schemaVersion: 1
```
## Example job and inference files sharing a model

The following example shows a training job that creates and saves a model that doubles numbers. That model is then used in a deployment's inference function.

`train.py` defines the job's Python code:

```python
from sklearn.linear_model import LinearRegression
import pickle, time

if __name__ == "__main__":
    model = LinearRegression()
    model.fit([[1], [2], [3], [time.time()]], [2, 4, 6, time.time() * 2])
    with open("data/model.pkl", "wb") as f:
        pickle.dump(model, f)
```

`jobs.yaml` defines the training job:

```yaml
jobs:
  train:
    command: python train.py
schemaVersion: 1
```

`source.py` defines the inference function that uses our trained model:

```python
from sklearn.linear_model import LinearRegression
import pickle

with open("data/model.pkl", "rb") as f:
    model = pickle.load(f)

def doubler(a: int) -> float:
    return model.predict([[a]])[0]

# to test locally
if __name__ == "__main__":
    print(doubler(21))
```

`metadata.yaml` defines how to call our `doubler` function for inferences:

```yaml
owner: you@company.com
runtimeInfo:
  mainFunction: doubler
  mainFunctionArgs:
    - a:int
    - return:float
  pythonVersion: "3.8"
schemaVersion: 2
```

Finally, a `requirements.txt` defines the environment's dependencies:

```
scikit-learn==1.0.2
```

Run `train.py` locally to create the first version of `model.pkl`, then `git push`. Checking these files in via git will create a deployment with a training job that retrains and redeploys the model.
## Full schema for jobs.yaml

A `jobs.yaml` that uses every option above (default arguments, dataset refreshes, a daily schedule, failure emails, and a runner size) looks like the following:

```yaml
jobs:
  train:
    arguments:
      - "string_arg"
      - 5
    beforeStart:
      refreshDatasets:
        - dataset1
        - dataset2
    command: python train.py
    onFailure:
      sendEmail: you@company.com
    size: small
    schedule: 0 0 * * *
schemaVersion: 1
```