Deployments: `metadata.yaml`

The metadata.yaml file defines how Modelbit should run your deployment. It specifies the Python version, argument and return types, and runtime behavior like DataFrame mode.

File format

The file is YAML-formatted. Here are the type definitions and field descriptions:

deployments/example_deployment/metadata.yaml
owner: str
runtimeInfo: Dict[str, Any]
  dataframeModeColumns: Optional[List[Dict[str, str]]]
  mainFunction: str
  mainFunctionArgs: List[str]
  pythonVersion: Literal["3.6" | "3.7" | "3.8" | "3.9" | "3.10" | "3.11" | "3.12"]
  systemPackages: Optional[List[str]]
  snowflakeMaxRows: Optional[int]
  snowflakeMockReturnValue: Optional[Any]
  capabilities: Optional[List[Literal["gpu=T4" | "gpu=A10G"]]]
schemaVersion: Literal[2]

The fields:

owner: The email you're using in Modelbit.
runtimeInfo:
- mainFunction: The name of the function to call in source.py with the arguments sent to your deployment.
- mainFunctionArgs: The argument names and types used by mainFunction, written as name:type. A return type, if present, is listed as return:type. Only int, str, float, and bool affect SQL types; everything else is treated as Any. Do not put spaces in these values (use foo:int, not foo: int). The types are used only when constructing the SQL functions for this deployment; they do not affect the types of inputs or outputs.
- dataframeModeColumns: If using DataFrame Mode, specifies the columns and types expected. This field should be omitted when not using DataFrame Mode.
- pythonVersion: The major-minor version of Python to use.
- systemPackages: A list of packages to install in the production environment with apt-get.
- snowflakeMaxRows: The maximum number of rows to allow in each batch.
- snowflakeMockReturnValue: The mock value to return when using mock mode in Snowflake.
- capabilities: Used to specify if this deployment needs a GPU.
schemaVersion: The file format version of this file.
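The constraints above can be sketched as a small check over an already-parsed metadata.yaml (for example, the dictionary a YAML loader would return). This is an illustrative sketch, not Modelbit's own validation: the helper name `check_metadata` is hypothetical, and treating every non-Optional field in the type definitions as required is an assumption.

```python
from typing import Any, Dict, List

# Allowed values copied from the type definitions on this page.
ALLOWED_PYTHON_VERSIONS = {"3.6", "3.7", "3.8", "3.9", "3.10", "3.11", "3.12"}
ALLOWED_CAPABILITIES = {"gpu=T4", "gpu=A10G"}


def check_metadata(meta: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a parsed metadata.yaml (hypothetical helper)."""
    problems: List[str] = []
    if not isinstance(meta.get("owner"), str):
        problems.append("owner must be a string")
    if meta.get("schemaVersion") != 2:
        problems.append("schemaVersion must be 2")

    runtime = meta.get("runtimeInfo")
    if not isinstance(runtime, dict):
        problems.append("runtimeInfo must be a mapping")
        return problems

    if not isinstance(runtime.get("mainFunction"), str):
        problems.append("runtimeInfo.mainFunction must be a string")

    # An unquoted 3.10 in YAML parses as the float 3.1, so require a string here.
    version = runtime.get("pythonVersion")
    if not isinstance(version, str) or version not in ALLOWED_PYTHON_VERSIONS:
        problems.append("runtimeInfo.pythonVersion must be a quoted, supported version")

    args = runtime.get("mainFunctionArgs")
    if not isinstance(args, list):
        problems.append("runtimeInfo.mainFunctionArgs must be a list")
    else:
        for entry in args:
            # Entries look like foo:int, never foo: int.
            if not isinstance(entry, str) or " " in entry or ":" not in entry:
                problems.append(f"mainFunctionArgs entry {entry!r} must be name:type with no spaces")

    for capability in runtime.get("capabilities") or []:
        if capability not in ALLOWED_CAPABILITIES:
            problems.append(f"unknown capability {capability!r}")
    return problems
```

An empty list means the file passed these particular checks; it does not guarantee that Modelbit will accept the file.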

Examples

Here are example metadata.yaml files showing various popular configurations.

Minimum metadata.yaml

The main function in the deployment:

source.py
def example_deployment(a):
    return ...

The associated metadata.yaml:

metadata.yaml
owner: you@company.com
runtimeInfo:
  mainFunction: example_deployment
  mainFunctionArgs:
    - a:Any
  pythonVersion: "3.9"
schemaVersion: 2

Simple deployment

A main function that specifies argument types and a return type:

source.py
def example_deployment(a: int) -> int:
    ...

The associated metadata.yaml:

metadata.yaml
owner: you@company.com
runtimeInfo:
  mainFunction: example_deployment
  mainFunctionArgs:
    - a:int
    - return:int
  pythonVersion: "3.9"
schemaVersion: 2
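To make the name:type convention concrete, here is a minimal sketch that splits mainFunctionArgs entries and classifies each type as either one of the four SQL-affecting types or Any. The helper names `split_arg` and `describe_args` are hypothetical, and the sketch only classifies types; how Modelbit turns them into SQL function signatures is not shown here.

```python
from typing import List, Tuple

# The only type names that affect SQL types, per the field description.
SQL_TYPES = {"int", "str", "float", "bool"}


def split_arg(entry: str) -> Tuple[str, str]:
    """Split a name:type entry and reduce the type to a SQL-affecting type or Any."""
    if " " in entry:
        raise ValueError(f"no spaces allowed in {entry!r} (use foo:int, not foo: int)")
    name, separator, type_name = entry.partition(":")
    if not separator or not name or not type_name:
        raise ValueError(f"{entry!r} is not in name:type form")
    return name, (type_name if type_name in SQL_TYPES else "Any")


def describe_args(entries: List[str]) -> List[Tuple[str, str]]:
    """Apply split_arg to every entry in a mainFunctionArgs list."""
    return [split_arg(entry) for entry in entries]


print(describe_args(["a:int", "return:int"]))
```

Entries such as b:dict or features_df:DataFrame come back with the type Any under this classification.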

Complex arguments

A function that uses more complex types:

source.py
from typing import Any, Dict, List

def example_deployment(a: List[str], b: Dict[str, Any]) -> Dict[str, List[Any]]:
    ...

The associated metadata.yaml:

metadata.yaml
owner: you@company.com
runtimeInfo:
  mainFunction: example_deployment
  mainFunctionArgs:
    - a:list
    - b:dict
    - return:dict
  pythonVersion: "3.9"
schemaVersion: 2
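The mapping from annotations to entries in the examples above (List[str] becomes a:list, Dict[str, Any] becomes b:dict) can be approximated from a function's signature. The sketch below is an assumption about how such entries could be derived, not a description of how Modelbit generates them; `type_label` and `main_function_args` are hypothetical names.

```python
import inspect
from typing import Any, Callable, Dict, List, get_origin


def type_label(annotation: Any) -> str:
    """Reduce an annotation to a bare type name, falling back to Any."""
    if annotation is inspect.Parameter.empty or annotation is Any:
        return "Any"
    # get_origin maps List[str] to list and Dict[str, Any] to dict.
    origin = get_origin(annotation)
    target = origin if origin is not None else annotation
    return getattr(target, "__name__", "Any")


def main_function_args(func: Callable[..., Any]) -> List[str]:
    """Build name:type entries (no spaces) from a function's signature."""
    signature = inspect.signature(func)
    entries = [
        f"{name}:{type_label(parameter.annotation)}"
        for name, parameter in signature.parameters.items()
    ]
    # Only emit a return entry when the function declares a return type.
    if signature.return_annotation is not inspect.Signature.empty:
        entries.append(f"return:{type_label(signature.return_annotation)}")
    return entries


def example_deployment(a: List[str], b: Dict[str, Any]) -> Dict[str, List[Any]]:
    ...


print(main_function_args(example_deployment))
```

A function with no annotations, like the one in the minimum example, yields a single a:Any entry and no return entry under this sketch.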

Specifying system packages to install

Use the systemPackages key within runtimeInfo to specify which packages to install with apt-get:

metadata.yaml
owner: you@company.com
runtimeInfo:
  mainFunction: example_deployment
  mainFunctionArgs:
    - a:int
    - return:int
  pythonVersion: "3.9"
  systemPackages:
    - build-essential
    - libgl1
    - libgl1-mesa-glx
    - libglib2.0-0
schemaVersion: 2

DataFrame mode

A function based on the DataFrame mode example:

source.py
import numpy as np
import pandas as pd

def get_predictions(features_df: pd.DataFrame) -> np.ndarray:
    return ...

The associated metadata.yaml:

metadata.yaml
owner: you@company.com
runtimeInfo:
  dataframeModeColumns:
    - dtype: int64
      example: 1
      name: feature_one
    - dtype: int64
      example: 2
      name: feature_two
  mainFunction: get_predictions
  mainFunctionArgs:
    - features_df:DataFrame
    - return:ndarray
  pythonVersion: "3.9"
schemaVersion: 2

Using a GPU

To run the deployment on a T4 GPU, set the capabilities field within runtimeInfo, as in the metadata below:

metadata.yaml
owner: you@company.com
runtimeInfo:
  capabilities:
    - gpu=T4
  ...
schemaVersion: 2

Snowflake mock return values

The following examples show different formats of mock return values. If omitted, the default is SQL's null.

These examples apply to the snowflakeMockReturnValue field within metadata.yaml.

Literals

To return a numeric value like 4:

metadata.yaml
owner: you@company.com
runtimeInfo:
  snowflakeMockReturnValue: 4
  ...
schemaVersion: 2

Or a string value like "An example sentence":

metadata.yaml
owner: you@company.com
runtimeInfo:
  snowflakeMockReturnValue: An example sentence
  ...
schemaVersion: 2

Dictionaries

To return a dictionary like {"key1": "value1", "key2": "value2"}:

metadata.yaml
owner: you@company.com
runtimeInfo:
  snowflakeMockReturnValue:
    key1: value1
    key2: value2
  ...
schemaVersion: 2

Simple lists

To return a list like [1, "two"]:

metadata.yaml
owner: you@company.com
runtimeInfo:
  snowflakeMockReturnValue:
    - 1
    - two
  ...
schemaVersion: 2

Lists of dictionaries

To return a list of dictionaries like the following:

[
  {"key1": "value1", "key2": "value2"},
  {"key1": "value3", "key2": "value4"}
]

metadata.yaml
owner: you@company.com
runtimeInfo:
  snowflakeMockReturnValue:
    - key1: value1
      key2: value2
    - key1: value3
      key2: value4
  ...
schemaVersion: 2
