Training jobs: metadata.yaml
The metadata.yaml file defines how Modelbit should run your training job. It specifies the Python version, argument types, and schedule behavior.
File format
The file is YAML-formatted. Here are the type definitions and field descriptions:
owner: str
runtimeInfo: Dict[str, Any]
  mainFunction: str
  mainFunctionArgs: List[str]
  pythonVersion: Literal["3.6", "3.7", "3.8", "3.9", "3.10", "3.11", "3.12"]
  systemPackages: Optional[List[str]]
schedules: Optional[List[Dict[str, Any]]]
  - beforeStart: Optional[Dict[str, Any]]
      refreshDatasets: List[str]
    onFailure: Optional[Dict[str, Any]]
      sendEmail: str
    schedule: cron-string
    size: Literal["small", "medium", "large", "xlarge", "2xlarge", "4xlarge", "gpu_small", "gpu_medium", "gpu_large"]
schemaVersion: Literal[1]
The fields:
- owner: The email you're using in Modelbit. The job will run as the owner's role. If the user no longer exists in Modelbit, the job will stop running.
- runtimeInfo:
  - mainFunction: The name of the function to call in source.py with the arguments sent to your training job.
  - mainFunctionArgs: The argument names and types used by mainFunction, formatted as <arg-name>:<arg-type>. There should not be spaces in these values (i.e. use foo:int, not foo: int).
  - pythonVersion: The major-minor version of Python to use.
  - systemPackages: A list of packages to install in the production environment with apt-get.
- schedules: Optional list of schedule dicts. Each schedule can have the following fields:
  - beforeStart:
    - refreshDatasets: List of dataset names to refresh before beginning the training job.
  - onFailure:
    - sendEmail: Email address to alert if the training job exits unsuccessfully.
  - schedule: Cron string representing the schedule of the training job.
  - size: Size of the machine to use when running the job.
- schemaVersion: The file format version of this file.
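The field rules above can be turned into a quick sanity check before committing the file. The following is a minimal sketch, not Modelbit's own validation: check_metadata is a hypothetical helper that takes the already-parsed contents of metadata.yaml as a Python dict and returns a list of problems. It only checks rules stated on this page, and it assumes owner, runtimeInfo and schemaVersion are required because the type definitions do not mark them Optional.

```python
from typing import Any, Dict, List

# Allowed values copied from the type definitions on this page.
PYTHON_VERSIONS = ("3.6", "3.7", "3.8", "3.9", "3.10", "3.11", "3.12")
SIZES = ("small", "medium", "large", "xlarge", "2xlarge", "4xlarge",
         "gpu_small", "gpu_medium", "gpu_large")


def check_metadata(meta: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list problems found in a parsed metadata.yaml."""
    problems: List[str] = []

    # Assumption: these top-level keys are required (not marked Optional above).
    for key in ("owner", "runtimeInfo", "schemaVersion"):
        if key not in meta:
            problems.append(f"missing required field: {key}")

    runtime = meta.get("runtimeInfo") or {}

    # An unquoted 3.10 in YAML parses as the float 3.1, so require a string.
    if runtime.get("pythonVersion") not in PYTHON_VERSIONS:
        problems.append("pythonVersion must be a quoted string such as '3.10'")

    # Each entry must look like <arg-name>:<arg-type> with no spaces.
    for entry in runtime.get("mainFunctionArgs") or []:
        text = str(entry)
        name, sep, type_name = text.partition(":")
        if not sep or not name or not type_name or " " in text:
            problems.append(f"bad mainFunctionArgs entry: {text!r}")

    # Only check size when a schedule sets it.
    for schedule in meta.get("schedules") or []:
        if "size" in schedule and schedule["size"] not in SIZES:
            problems.append(f"unknown size: {schedule['size']!r}")

    return problems
```

An empty list means none of these checks failed. It does not mean Modelbit will accept the file, since the service may enforce rules that are not described here.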
Examples
Here are example metadata.yaml files showing various popular configurations.
Minimum metadata.yaml
The main function in the training job:
def train_my_predictor():
    return ...
The associated metadata.yaml:
owner: you@company.com
runtimeInfo:
  mainFunction: train_my_predictor
  mainFunctionArgs: []
  pythonVersion: "3.9"
schemaVersion: 1
Specifying system packages to install
Use the systemPackages key within runtimeInfo to specify which packages to install with apt-get:
def train_my_predictor(a: int) -> int:
    return ...
The associated metadata.yaml:
owner: you@company.com
runtimeInfo:
  mainFunction: train_my_predictor
  mainFunctionArgs:
    - a:int
    - return:int
  pythonVersion: "3.9"
  systemPackages:
    - build-essential
    - libgl1
    - libgl1-mesa-glx
    - libglib2.0-0
schemaVersion: 1
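To see how the mainFunctionArgs entries in this example line up with the function's type hints, here is a small sketch using the standard library's inspect module. describe_signature is a hypothetical helper, not part of Modelbit, and how Modelbit itself derives these entries is not documented here. The sketch assumes annotations are plain classes such as int and reports missing annotations as "Any".

```python
import inspect
from typing import Callable, List


def describe_signature(func: Callable) -> List[str]:
    """Hypothetical helper: list type hints as <arg-name>:<arg-type> strings."""

    def type_name(annotation: object) -> str:
        # Assumption: values without a type hint are reported as "Any".
        if annotation is inspect.Signature.empty:
            return "Any"
        return getattr(annotation, "__name__", str(annotation))

    signature = inspect.signature(func)
    entries = [f"{name}:{type_name(param.annotation)}"
               for name, param in signature.parameters.items()]
    # The return type is listed last under the name "return".
    entries.append(f"return:{type_name(signature.return_annotation)}")
    return entries


def train_my_predictor(a: int) -> int:
    return a


print(describe_signature(train_my_predictor))  # ['a:int', 'return:int']
```

The output contains no spaces around the colon, matching the format that mainFunctionArgs requires.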
Daily schedule with dataset refresh
This schedule runs every midnight UTC (0 0 * * *) and refreshes the dataset my_dataset before beginning the training job.
owner: you@company.com
runtimeInfo:
  mainFunction: train_my_predictor
  mainFunctionArgs: []
  pythonVersion: "3.10"
  systemPackages: null
schedules:
  - beforeStart:
      refreshDatasets:
        - my_dataset
    onFailure:
      sendEmail: you@company.com
    schedule: 0 0 * * *
    size: medium
schemaVersion: 1
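Cron strings are easy to misread, so it can help to test one against a few timestamps before putting it in a schedule. The sketch below is a simplified matcher for standard five-field cron syntax (minute, hour, day of month, month, day of week). cron_matches is a hypothetical helper that handles only *, plain numbers, and three-letter day names. It is not Modelbit's scheduler, and it assumes the timestamps are in UTC as described above.

```python
from datetime import datetime

# Cron counts Sunday as day 0.
DAY_NAMES = {"SUN": 0, "MON": 1, "TUE": 2, "WED": 3, "THU": 4, "FRI": 5, "SAT": 6}


def field_matches(field: str, value: int) -> bool:
    """Match one cron field that is '*', a number, or a day name."""
    if field == "*":
        return True
    if field.upper() in DAY_NAMES:
        return DAY_NAMES[field.upper()] == value
    return int(field) == value


def cron_matches(expression: str, moment: datetime) -> bool:
    """Simplified check: does this five-field cron string fire at this minute?"""
    minute, hour, day_of_month, month, day_of_week = expression.split()
    weekday = moment.isoweekday() % 7  # isoweekday() gives Sunday as 7
    time_ok = (field_matches(minute, moment.minute)
               and field_matches(hour, moment.hour)
               and field_matches(month, moment.month))
    dom_ok = field_matches(day_of_month, moment.day)
    dow_ok = field_matches(day_of_week, weekday)
    # Standard cron rule: if both day fields are restricted, either may match.
    if day_of_month != "*" and day_of_week != "*":
        return time_ok and (dom_ok or dow_ok)
    return time_ok and dom_ok and dow_ok


monday_midnight = datetime(2024, 1, 8, 0, 0)
print(cron_matches("0 0 * * *", monday_midnight))    # True: fires every midnight
print(cron_matches("0 0 * * SUN", monday_midnight))  # False: Sundays only
```

Ranges, lists, and step values such as */5 are deliberately left out to keep the sketch short.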
Daily schedule passing arguments
This training job takes two arguments (a and b). The scheduled execution of this job passes 25 and 42 as arguments to train_my_predictor.
def train_my_predictor(a: int, b: int) -> int:
    return ...
The associated metadata.yaml:
owner: you@company.com
runtimeInfo:
  mainFunction: train_my_predictor
  mainFunctionArgs:
    - a:int
    - b:int
  pythonVersion: "3.10"
  systemPackages: null
schedules:
  - arguments:
      - 25
      - 42
    schedule: 0 0 * * *
    size: large
schemaVersion: 1
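The relationship between a schedule's arguments list and mainFunctionArgs can be sketched as follows. bind_schedule_arguments is a hypothetical helper, not a Modelbit API. It assumes the arguments are matched to parameters by position, which is what the example above suggests but is not stated explicitly.

```python
from typing import Any, Dict, List


def bind_schedule_arguments(main_function_args: List[str],
                            arguments: List[Any]) -> Dict[str, Any]:
    """Hypothetical helper: pair a schedule's arguments with parameter names."""
    # A "return:<type>" entry describes the return value, not a parameter.
    names = [entry.split(":", 1)[0] for entry in main_function_args
             if not entry.startswith("return:")]
    if len(names) != len(arguments):
        raise ValueError(f"expected {len(names)} arguments, got {len(arguments)}")
    # Assumption: arguments are matched to parameters in order.
    return dict(zip(names, arguments))


print(bind_schedule_arguments(["a:int", "b:int"], [25, 42]))  # {'a': 25, 'b': 42}
```

A check like this catches a schedule that passes too few or too many values before the job is ever run.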
Two schedules
This training job has two schedules, one nightly (passing the argument 1) and one weekly on Sundays (passing the argument 2):
owner: you@company.com
runtimeInfo:
  mainFunction: train_my_predictor
  mainFunctionArgs:
    - a:int
  pythonVersion: "3.10"
  systemPackages: null
schedules:
  - arguments:
      - 1
    schedule: 0 0 * * *
    size: small
  - arguments:
      - 2
    schedule: 0 0 * * SUN
    size: small
schemaVersion: 1