Batch inference REST requests

To process many inferences in a single REST call, use the batch inference syntax. It's similar to the single inference syntax, but takes a list of inference requests instead. Each item in the request's list is an ID followed by the arguments for the inference function.

Making requests

Suppose our deployment's main function is example_doubler:

def example_doubler(foo: int):
    return 2 * foo

In this example we'll double a batch of three numbers, 10, 11, and 12, with example_doubler. The data payload for a batch inference request is a list of items, each of the form [<ID>, ...arguments]. Here, the payload looks like this:

[
  [1, 10],
  [2, 11],
  [3, 12]
]

We're using 1, 2, and 3 as the IDs for the inferences in the batch.
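When the inputs start out as a plain list of values, a payload of this shape can be produced by pairing each value with an ID. The following is a minimal sketch under stated assumptions: `build_batch_payload` is a hypothetical helper, not part of modelbit, and it assumes sequential integer IDs starting at 1 are acceptable for your use case.

```python
from typing import Any, List


def build_batch_payload(values: List[Any]) -> List[List[Any]]:
    """Pair each single-argument input with a sequential ID, starting at 1.

    Hypothetical helper for illustration; not part of the modelbit package.
    """
    return [[idx, value] for idx, value in enumerate(values, start=1)]


payload = build_batch_payload([10, 11, 12])
print(payload)  # [[1, 10], [2, 11], [3, 12]]
```

Any other ID scheme, such as database keys, would work the same way as long as each item keeps the ID first.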

Calling example_doubler with modelbit.get_inference:

import modelbit

modelbit.get_inference(
    deployment="example_doubler",
    workspace="<YOUR_WORKSPACE>",
    region="<YOUR_REGION>",
    data=[[1, 10], [2, 11], [3, 12]])

# return value
{"data": [[1, 20], [2, 22], [3, 24]]}

The response is a list of results, each paired with the ID of the corresponding item in the request payload.
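Because each result carries its request ID, the response can be turned into a lookup table. The sketch below assumes the response has the `{"data": [[<ID>, <result>], ...]}` shape shown in the return value above and that the IDs are hashable; `results_by_id` is a hypothetical helper, not part of modelbit.

```python
from typing import Any, Dict


def results_by_id(response: Dict[str, Any]) -> Dict[Any, Any]:
    """Map each request ID to its result.

    Assumes the response looks like {"data": [[id, result], ...]}.
    Hypothetical helper for illustration; not part of the modelbit package.
    """
    return {row[0]: row[1] for row in response["data"]}


# Response copied from the example_doubler call above.
response = {"data": [[1, 20], [2, 22], [3, 24]]}
lookup = results_by_id(response)
print(lookup[2])  # 22
```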

Request format

The format of a batch request is a list of lists. Each inner list has the format [<ID>, ...arguments]: an ID followed by the arguments for one inference. You can choose any value for the ID of each inference request in the batch. Here are some examples calling functions that expect different kinds of arguments:

Dictionary arguments

This deployment expects a dictionary:

from typing import Dict

def my_deployment(d: Dict[str, float]):
    ...

To send the two requests {"my_key": 25} and {"my_key": 30} as a batch input, pair each dictionary with an ID and send the resulting list as the data parameter:

import modelbit

modelbit.get_inference(..., data=[[1, {"my_key": 25}], [2, {"my_key": 30}]])

Multiple arguments

This deployment expects several arguments:

def my_deployment(a: int, b: str, c: float):
    ...

To send the two requests 1, "two", 3.0 and 4, "five", 6.0 as a batch input, prefix each set of arguments with an ID and send the resulting list as the data parameter:

import modelbit

modelbit.get_inference(..., data=[[1, 1, "two", 3.0], [2, 4, "five", 6.0]])
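For deployments with several positional arguments, the same payload can be assembled from a list of argument tuples. This is a sketch only: `build_multi_arg_payload` is a hypothetical helper, not part of modelbit, and it assumes sequential integer IDs starting at 1.

```python
from typing import Any, List, Sequence


def build_multi_arg_payload(calls: Sequence[Sequence[Any]]) -> List[List[Any]]:
    """Prefix each call's positional arguments with a sequential ID.

    Hypothetical helper for illustration; not part of the modelbit package.
    """
    return [[idx, *args] for idx, args in enumerate(calls, start=1)]


payload = build_multi_arg_payload([(1, "two", 3.0), (4, "five", 6.0)])
print(payload)  # [[1, 1, 'two', 3.0], [2, 4, 'five', 6.0]]
```

The first element of each item is the ID; everything after it is passed to the deployment in argument order, which is why the first item reads 1, 1, "two", 3.0.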