
modelbit.get_inference(deployment=, data=, ...)

An easy way to call the REST API for your deployment.


Parameters

  • deployment: str The name of the deployment to receive the inference request.
  • data: Any The data to send to the deployment. Can be formatted for single or batch inferences.
  • branch: Optional[str] The branch the deployment is on. If unspecified, main is used, or the branch set by the most recent call to mb.switch_branch.
  • version: Optional[Union[str, int]] The version of the deployment to call. Can be latest, a numeric version, or an alias. If unspecified, latest is used.
  • region: Optional[str] The region of your Modelbit workspace. If unspecified, app is used. Customers in other regions (e.g. us-east-1) must specify the region.
  • workspace: Optional[str] The name of your Modelbit workspace. If unspecified, the value of the MB_WORKSPACE_NAME envvar will be used. If no workspace name can be found, an error will be raised.
  • api_key: Optional[str] The API key to send along with the request. If unspecified, the value of the MB_API_KEY envvar will be used. If no API key can be found, an error will be raised. Required if your workspace uses API keys to authenticate inference requests.
  • batch_size: Optional[int] If passing in a large DataFrame or a large batch, the get_inference call will subdivide the request into multiple requests. Setting this parameter changes the size of the subdivisions. The default is 10_000.
  • timeout_seconds: Optional[int] The number of seconds to wait for the inference before timing out.


Returns

Dict[str, Any] - The results of calling the REST API. Successful calls have a data key with the results, and unsuccessful calls have an error key with the error message.
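A small helper like the following (hypothetical, not part of the modelbit package) shows one way to handle the two response shapes described above:

```python
def unwrap_inference(result):
    """Return the inference payload, or raise if the REST API reported an error."""
    # Unsuccessful calls carry an "error" key with the error message.
    if "error" in result:
        raise RuntimeError(result["error"])
    # Successful calls carry a "data" key with the results.
    return result["data"]
```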


Examples

Most of these examples assume the envvars MB_WORKSPACE_NAME and MB_API_KEY (if needed) have already been set.
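For instance, the envvars can be set from Python before the first call (the values below are placeholders; substitute your own workspace name and key):

```python
import os

# Placeholder values -- replace with your actual workspace name and API key.
os.environ["MB_WORKSPACE_NAME"] = "my_workspace"
os.environ["MB_API_KEY"] = "mb_XXXXXXXX"  # only needed if your workspace requires API keys
```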

Get one inference

import modelbit

modelbit.get_inference(deployment="example_model", data=10)

Get a batch of inferences

modelbit.get_inference(deployment="example_model", data=[[1, 10], [2, 11]])

Specify the branch and version

modelbit.get_inference(deployment="example_model", data=10, branch="my_branch", version=5)

Specify the workspace and region

modelbit.get_inference(deployment="example_model", data=10, workspace="my_workspace", region="ap-south-1")

Set a timeout on your inference

modelbit.get_inference(deployment="example_model", data=10, timeout_seconds=20)

Using a dataframe

If your model is using DataFrame mode, you can send the dataframe in data=:
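For example, assuming the deployment's function takes a DataFrame of features (the column names below are made up; use the columns your deployment expects):

```python
import pandas as pd

# Hypothetical feature columns for illustration.
my_dataframe = pd.DataFrame({
    "feature_a": [1.0, 2.0, 3.0],
    "feature_b": [10, 11, 12],
})
```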

modelbit.get_inference(deployment="example_model", data=my_dataframe)

Splitting a dataframe into smaller chunks

If your model is using DataFrame mode with a very large dataframe, the request will automatically be split into several batches. You can change the batch size with batch_size=:

modelbit.get_inference(deployment="example_model", data=my_dataframe, batch_size=500)
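Each chunk of up to batch_size rows becomes one underlying REST request, so the number of requests is the row count divided by the batch size, rounded up. A quick way to estimate it:

```python
import math

batch_size = 500
num_rows = 1_234  # e.g. len(my_dataframe)
# Each chunk of up to batch_size rows is sent as one REST request.
num_requests = math.ceil(num_rows / batch_size)  # 3 requests for 1,234 rows
```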
