modelbit.get_inference(deployment=..., ...)
An easy way to call the REST API for your deployment.
Parameters
deployment: str
The name of the deployment to receive the inference request.
data: Any
The data to send to the deployment. Can be formatted for single or batch inferences.
branch: Optional[str]
The branch the deployment is on. If unspecified, the current branch is used, which is main by default.
version: Optional[Union[str, int]]
The version of the deployment to call. Can be latest, a numeric version, or an alias. If unspecified, latest is used.
region: Optional[str]
The region of your Modelbit workspace. If unspecified, app is used. Customers in other regions (e.g. us-east-1) must specify the region.
workspace: Optional[str]
The name of your Modelbit workspace. If unspecified, the value in the MB_WORKSPACE_NAME envvar will be used. If no workspace name can be found, an error will be raised.
api_key: Optional[str]
The API key to send along with the request. If unspecified, the value in the MB_API_KEY envvar will be used. If no API key can be found, an error will be raised. Required if your workspace uses API keys to authenticate inference requests.
batch_size: Optional[int]
If passing in a large DataFrame or a large batch, the get_inference call will subdivide the request into multiple requests. Setting this parameter changes the size of the subdivisions. The default is 10_000.
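Putting the parameters together, here is a sketch of a fully explicit call; all of the values below are placeholders for illustration and should be replaced with your own deployment, workspace, and key details:
import modelbit
modelbit.get_inference(
    deployment="example_model",
    data=10,
    branch="main",
    version="latest",
    region="app",
    workspace="my_workspace",
    api_key="my_api_key",
    batch_size=10_000,
)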
Returns
Dict[str, Any]
The results of calling the REST API. Successful calls have a data key with the results, and unsuccessful calls have an error key with the error message.
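For example, a minimal sketch of handling the returned dictionary, checking for the error key before reading the results:
import modelbit
response = modelbit.get_inference(deployment="example_model", data=10)
if "error" in response:
    # Unsuccessful call: the error key holds the error message
    print("Inference failed:", response["error"])
else:
    # Successful call: the data key holds the results
    print("Inference result:", response["data"])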
Examples
Most of these examples assume the envvars MB_WORKSPACE_NAME and MB_API_KEY (if needed) have already been set.
Get one inference
import modelbit
modelbit.get_inference(deployment="example_model", data=10)
Get a batch of inferences
modelbit.get_inference(deployment="example_model", data=[[1, 10], [2, 11]])
Specify the branch and version
modelbit.get_inference(deployment="example_model", data=10, branch="my_branch", version=5)
Specify the workspace and region
modelbit.get_inference(deployment="example_model", data=10, workspace="my_workspace", region="ap-south-1")
Set a timeout on your inference
modelbit.get_inference(deployment="example_model", data=10, timeout_seconds=20)
Using a dataframe
If your model is using DataFrame mode, you can send the dataframe in data=:
modelbit.get_inference(deployment="example_model", data=my_dataframe)
Splitting a dataframe into smaller chunks
If your model is using DataFrame mode with a very large dataframe, it will automatically get split into several batches. You can change the batch size with batch_size=:
modelbit.get_inference(deployment="example_model", data=my_dataframe, batch_size=500)