Deployments: metadata.yaml

The metadata.yaml file defines how Modelbit should run your deployment. It specifies the Python version, argument and return types, and runtime behavior like DataFrame mode.

File format

The file is YAML-formatted. Here are the type definitions and field descriptions:

deployments/example_deployment/metadata.yaml
owner: str
runtimeInfo: Dict[str, Any]
  dataframeModeColumns: Optional[List[Dict[str, str]]]
  mainFunction: str
  mainFunctionArgs: List[str]
  pythonVersion: Literal["3.6", "3.7", "3.8", "3.9", "3.10", "3.11", "3.12"]
  systemPackages: Optional[List[str]]
  snowflakeMaxRows: Optional[int]
  snowflakeMockReturnValue: Optional[Any]
  capabilities: Optional[List[Literal["gpu=T4", "gpu=A10G"]]]
tags: Optional[List[str]]
schemaVersion: Literal[2]

The fields:

  • owner: The email you're using in Modelbit.

  • runtimeInfo:

    • mainFunction: The name of the function to call in source.py with the arguments sent to your deployment.
    • mainFunctionArgs: The argument names and types used by mainFunction, written as name:type. The types that affect SQL types are int, str, float, and bool; everything else is treated as Any. Do not put spaces in these values (use foo:int, not foo: int). The types are used only when constructing the SQL functions for this deployment. They do not affect the types of inputs or outputs.
    • dataframeModeColumns: If using DataFrame Mode, specifies the columns and types expected. This field should be omitted when not using DataFrame Mode.
    • pythonVersion: The major-minor version of Python to use.
    • systemPackages: A list of packages to install in the production environment with apt-get.
    • snowflakeMaxRows: The maximum number of rows to allow in each batch.
    • snowflakeMockReturnValue: The mock value to return when using mock mode in Snowflake.
    • capabilities: Used to specify if this deployment needs a GPU.
  • tags: A list of tags to associate with the deployment. Omitted if the deployment doesn't have any tags.

  • schemaVersion: The file format version of this file.
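The rules in the field list above can be sketched as a small pre-commit check. This is an illustrative sketch, not part of Modelbit: check_metadata is a hypothetical helper, and it assumes the file has already been parsed into a Python dict (for example with a YAML library of your choice).

```python
from typing import Any, Dict, List

# Allowed values copied from the type definitions above.
PYTHON_VERSIONS = ("3.6", "3.7", "3.8", "3.9", "3.10", "3.11", "3.12")
GPU_CAPABILITIES = ("gpu=T4", "gpu=A10G")


def check_metadata(metadata: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of problems; empty means none found."""
    problems: List[str] = []
    if not isinstance(metadata.get("owner"), str):
        problems.append("owner must be a string")
    if metadata.get("schemaVersion") != 2:
        problems.append("schemaVersion must be 2")

    runtime = metadata.get("runtimeInfo")
    if not isinstance(runtime, dict):
        problems.append("runtimeInfo must be a mapping")
        return problems

    if not isinstance(runtime.get("mainFunction"), str):
        problems.append("runtimeInfo.mainFunction must be a string")

    args = runtime.get("mainFunctionArgs")
    if not isinstance(args, list):
        problems.append("runtimeInfo.mainFunctionArgs must be a list")
    else:
        for arg in args:
            # Entries look like name:type and must not contain spaces.
            if not isinstance(arg, str) or " " in arg or ":" not in arg:
                problems.append(f"bad mainFunctionArgs entry: {arg!r}")

    # An unquoted 3.10 in YAML parses as the float 3.1, so require a string.
    if runtime.get("pythonVersion") not in PYTHON_VERSIONS:
        problems.append('pythonVersion must be a quoted string such as "3.9"')

    for capability in runtime.get("capabilities") or []:
        if capability not in GPU_CAPABILITIES:
            problems.append(f"unknown capability: {capability!r}")
    return problems
```

An empty list means the dict passed every check in this sketch. It does not guarantee that Modelbit will accept the file.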

Examples

Here are example metadata.yaml files showing various popular configurations.

Minimum metadata.yaml

The main function in the deployment:

source.py
def example_deployment(a):
    return ...

The associated metadata.yaml:

metadata.yaml
owner: you@company.com
runtimeInfo:
  mainFunction: example_deployment
  mainFunctionArgs:
    - a:Any
  pythonVersion: "3.9"
schemaVersion: 2

Simple deployment

A main function that specifies argument types and a return type:

source.py
def example_deployment(a: int) -> int:
    ...

The associated metadata.yaml:

metadata.yaml
owner: you@company.com
runtimeInfo:
  mainFunction: example_deployment
  mainFunctionArgs:
    - a:int
    - return:int
  pythonVersion: "3.9"
schemaVersion: 2

Complex arguments

A function that uses more complex types:

source.py
from typing import Any, Dict, List

def example_deployment(a: List[str], b: Dict[str, Any]) -> Dict[str, List[Any]]:
    ...

The associated metadata.yaml:

metadata.yaml
owner: you@company.com
runtimeInfo:
  mainFunction: example_deployment
  mainFunctionArgs:
    - a:list
    - b:dict
    - return:dict
  pythonVersion: "3.9"
schemaVersion: 2
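The mainFunctionArgs entries in the examples above follow a pattern: name:type with no spaces, an unannotated argument written as Any, and a container annotation such as List[str] reduced to list. The sketch below reproduces that pattern from a function's annotations. It is an assumption-laden illustration, not Modelbit's implementation; build_main_function_args and type_label are hypothetical names.

```python
import inspect
from typing import Any, Dict, List, get_origin


def type_label(annotation: Any) -> str:
    """Return a short type name with no spaces; 'Any' when unannotated."""
    if annotation is inspect.Signature.empty or annotation is Any:
        return "Any"
    # get_origin maps List[str] -> list and Dict[str, Any] -> dict.
    origin = get_origin(annotation)
    target = origin if origin is not None else annotation
    return getattr(target, "__name__", "Any")


def build_main_function_args(func) -> List[str]:
    """Hypothetical helper: list name:type entries, plus return:type if annotated."""
    signature = inspect.signature(func)
    entries = [
        f"{name}:{type_label(param.annotation)}"
        for name, param in signature.parameters.items()
    ]
    if signature.return_annotation is not inspect.Signature.empty:
        entries.append(f"return:{type_label(signature.return_annotation)}")
    return entries


def example_deployment(a: List[str], b: Dict[str, Any]) -> Dict[str, List[Any]]:
    ...


print(build_main_function_args(example_deployment))
```

For the function above, this yields the same three entries as the metadata.yaml in this example: a:list, b:dict, and return:dict.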

Specifying system packages to install

Use the systemPackages key within runtimeInfo to specify which packages to install with apt-get:

metadata.yaml
owner: you@company.com
runtimeInfo:
  mainFunction: example_deployment
  mainFunctionArgs:
    - a:int
    - return:int
  pythonVersion: "3.9"
  systemPackages:
    - build-essential
    - libgl1
    - libgl1-mesa-glx
    - libglib2.0-0
schemaVersion: 2

DataFrame mode

A function based on the DataFrame mode example:

source.py
import numpy as np
import pandas as pd

def get_predictions(features_df: pd.DataFrame) -> np.ndarray:
    return ...

The associated metadata.yaml:

metadata.yaml
owner: you@company.com
runtimeInfo:
  dataframeModeColumns:
    - dtype: int64
      example: 1
      name: feature_one
    - dtype: int64
      example: 2
      name: feature_two
  mainFunction: get_predictions
  mainFunctionArgs:
    - features_df:DataFrame
    - return:ndarray
  pythonVersion: "3.9"
schemaVersion: 2
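Each dataframeModeColumns entry above carries a dtype, an example value, and a column name. As a rough illustration of that shape, the sketch below derives such entries from one sample row. It is not how Modelbit generates the field: columns_from_example is a hypothetical helper, and the dtype names assume the usual pandas defaults for plain Python values.

```python
from typing import Any, Dict, List

# Assumed pandas dtype names for plain Python values. bool is listed before
# int because bool is a subclass of int in Python.
_DTYPES = [(bool, "bool"), (int, "int64"), (float, "float64")]


def columns_from_example(row: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: one dataframeModeColumns-style entry per column."""
    columns = []
    for name, value in row.items():
        # Anything that is not bool, int, or float falls back to "object".
        dtype = next((d for t, d in _DTYPES if isinstance(value, t)), "object")
        columns.append({"dtype": dtype, "example": value, "name": name})
    return columns


print(columns_from_example({"feature_one": 1, "feature_two": 2}))
```

With the sample row shown, the result has the same two int64 entries as the dataframeModeColumns list in the metadata.yaml above.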

Using a GPU

To run the deployment on a T4 GPU, set the capabilities field within runtimeInfo, as in the metadata below:

metadata.yaml
owner: you@company.com
runtimeInfo:
  capabilities:
    - gpu=T4
  ...
schemaVersion: 2

Specifying tags for a deployment

To add tags to a deployment, set the tags top-level key with an array of strings:

metadata.yaml
owner: you@company.com
runtimeInfo: ...
tags:
- other_tag
- project_foo
schemaVersion: 2

Tags should be listed alphabetically within metadata.yaml for consistency. Deployments without tags can omit the tags key.
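The two conventions above (alphabetical order, and no tags key when there are no tags) can be sketched as a small helper that operates on the parsed metadata. set_tags is a hypothetical name, and dropping duplicate tags is an extra assumption of this sketch rather than a documented rule.

```python
from typing import Any, Dict, Iterable


def set_tags(metadata: Dict[str, Any], tags: Iterable[str]) -> Dict[str, Any]:
    """Hypothetical helper: return a copy with sorted tags, or no tags key."""
    updated = dict(metadata)  # shallow copy so the caller's dict is untouched
    ordered = sorted(set(tags))  # assumption: duplicates are not wanted
    if ordered:
        updated["tags"] = ordered
    else:
        updated.pop("tags", None)
    return updated


metadata = {"owner": "you@company.com", "schemaVersion": 2}
print(set_tags(metadata, ["project_foo", "other_tag"]))
```

Passing an empty list removes the tags key, matching the rule that deployments without tags can omit it.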

Snowflake mock return values

The following examples show the different formats a mock return value can take. If the field is omitted, the default is SQL's null.

These examples apply to the snowflakeMockReturnValue field within metadata.yaml.

Literals

To return a numeric value like 4:

metadata.yaml
owner: you@company.com
runtimeInfo:
  snowflakeMockReturnValue: 4
  ...
schemaVersion: 2

Or a string value like "An example sentence":

metadata.yaml
owner: you@company.com
runtimeInfo:
  snowflakeMockReturnValue: An example sentence
  ...
schemaVersion: 2

Dictionaries

To return a dictionary like {"key1": "value1", "key2": "value2"}:

metadata.yaml
owner: you@company.com
runtimeInfo:
  snowflakeMockReturnValue:
    key1: value1
    key2: value2
  ...
schemaVersion: 2

Simple lists

To return a list like [1, "two"]:

metadata.yaml
owner: you@company.com
runtimeInfo:
  snowflakeMockReturnValue:
    - 1
    - two
  ...
schemaVersion: 2

Lists of dictionaries

To return a list of dictionaries like:

[
  {"key1": "value1", "key2": "value2"},
  {"key1": "value3", "key2": "value4"}
]

metadata.yaml
owner: you@company.com
runtimeInfo:
  snowflakeMockReturnValue:
    - key1: value1
      key2: value2
    - key1: value3
      key2: value4
  ...
schemaVersion: 2