
Batch inference REST requests

To retrieve many inferences in a single REST call, use the batch request syntax. It's similar to the single-request syntax, but takes a list of inference requests instead. Each item in the list is an inference represented by an ID followed by the arguments for the main function.

Request format

Calling a deployment in batch involves sending a POST request with a list of one or more sets of arguments for the deployment's main function.

For example, suppose the deployment's main function is example_doubler:

def example_doubler(foo: int):
    return 2 * foo

Example POST data for this deployment to double the numbers 10, 11, and 12:

{
  "data": [
    [1, 10],
    [2, 11],
    [3, 12]
  ]
}

Above, the number 1 is the ID of the request to double 10; it will be returned alongside the output of the main function. The ID can be any string or number that's convenient for your use case. IDs should be unique within a batch, and responses are returned sorted by ID.
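Since IDs only need to be unique within the batch, a simple approach is to number the inputs by position. A minimal sketch of building the payload above (the variable names here are illustrative, not part of the API):

```python
import json

# Build a batch payload from a list of inputs, using each input's
# 1-based position as its request ID.
values = [10, 11, 12]
payload = {"data": [[i, v] for i, v in enumerate(values, start=1)]}

print(json.dumps(payload))
```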

Using curl

Calling example_doubler with curl:

curl -s -XPOST "https://<your-workspace-url>/v1/example_doubler/latest" \
-d '{"data": [[1, 10], [2, 11], [3, 12]]}'

Using Python

Calling example_doubler with modelbit.get_inference:

import modelbit

modelbit.get_inference(
    deployment="example_doubler",
    workspace="your_workspace",
    data=[[1, 10], [2, 11], [3, 12]],
)

Calling example_doubler with the requests package:

import json, requests

requests.post(
    "https://<your-workspace-url>/v1/example_doubler/latest",
    headers={"Content-Type": "application/json"},
    data=json.dumps({"data": [[1, 10], [2, 11], [3, 12]]}),
).json()
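When you have many more inputs than you want to send in one call, you can split them into several batch payloads before POSTing each one. This helper is a local sketch, not part of the Modelbit API; the `make_batch_payloads` name and `batch_size` parameter are assumptions for illustration:

```python
def make_batch_payloads(values, batch_size):
    # Split a list of inputs into batch-request payloads of at most
    # batch_size items each; IDs number the inputs across all batches.
    items = [[i, v] for i, v in enumerate(values, start=1)]
    return [
        {"data": items[start:start + batch_size]}
        for start in range(0, len(items), batch_size)
    ]

# Each payload can then be POSTed to the deployment URL as shown above.
payloads = make_batch_payloads([10, 11, 12, 13, 14], batch_size=2)
```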

Response format

The REST response for deployments called in batch is similar to the request format. The response is a list of the outputs from the deployment's main function along with the IDs that were passed in.

The response from example_doubler called with the batch of 3 inputs above is:

{
  "data": [
    [1, 20],
    [2, 22],
    [3, 24]
  ]
}
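Because each output is paired with its request ID, the response is easy to turn into a lookup table. A minimal sketch; the response body is hardcoded here for illustration, where in practice it would come from `response.json()`:

```python
# Batch response body, as returned by the deployment above.
response = {"data": [[1, 20], [2, 22], [3, 24]]}

# Map each request ID to its output.
results = {request_id: output for request_id, output in response["data"]}
```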

More examples

Here are a few more examples of how to call deployments with batch inference requests.

Dictionary arguments

If your deployment expects a single dictionary, send it after the ID of the request in the data parameter.

For example, a request for the following example_dict_doubler expects an ID and a dictionary, [ID, {"value": <number>}]:

def example_dict_doubler(d: dict):
    return d["value"] * 2

Call example_dict_doubler by sending requests 1 and 2 as [[1, {"value": 10}], [2, {"value": 20}]] in the data:

curl -s -XPOST "https://<your-workspace-url>/v1/example_dict_doubler/latest" \
-d '{"data": [[1, {"value": 10}], [2, {"value": 20}]]}'
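To sanity-check what this batch should return, you can mirror the server's behavior locally: apply the main function to each item's argument and pair the result with the item's ID. A sketch, assuming the same example_dict_doubler defined above:

```python
def example_dict_doubler(d: dict):
    return d["value"] * 2

batch = [[1, {"value": 10}], [2, {"value": 20}]]

# Each item is [ID, dict_argument]: apply the main function to the
# argument and keep the ID alongside the output.
expected = [[request_id, example_dict_doubler(arg)] for request_id, arg in batch]
```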

Multiple arguments

If your deployment expects multiple arguments, send them as additional terms after the ID in the request array.

For example, a request for the following example_adder expects an ID and three numeric inputs, [ID, a, b, c]:

def example_adder(a: int, b: int, c: int):
    return a + b + c

Call example_adder by sending requests 1 and 2 as [[1, 3, 4, 5], [2, 6, 7, 8]] in the data:

curl -s -XPOST "https://<your-workspace-url>/v1/example_adder/latest" \
-d '{"data": [[1, 3, 4, 5], [2, 6, 7, 8]]}'
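As with the dictionary case, the multi-argument batch can be checked locally by unpacking everything after the ID into the main function's parameters. A sketch, assuming the same example_adder defined above:

```python
def example_adder(a: int, b: int, c: int):
    return a + b + c

batch = [[1, 3, 4, 5], [2, 6, 7, 8]]

# Each item is [ID, a, b, c]: the arguments after the ID are unpacked
# positionally into the main function.
expected = [[item[0], example_adder(*item[1:])] for item in batch]
```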