Batch inference REST requests
To retrieve many inferences in a single REST call, use the batch request syntax. It is similar to the single request syntax, but takes a list of inference requests instead. Each item in the list is one inference, represented by an ID followed by the arguments for the main function.
Request format
Calling a deployment in batch involves sending a POST request with a list of one or more sets of arguments for the deployment's main function.
For example, suppose the deployment's main function is example_doubler:
    def example_doubler(foo: int):
        return 2 * foo
Example POST data for this deployment to double the numbers 10, 11, and 12 is:
    {
      "data": [
        [1, 10],
        [2, 11],
        [3, 12]
      ]
    }
Above, the number 1 is the ID of the request to double 10, and it will be returned with the output of the main function. The ID can be any string or number that's convenient for your use case. IDs should be unique among the items in a batch, and responses will be returned sorted by the IDs.
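As a rough sketch of the request shape described above, the following Python helper assembles the batch body and rejects repeated IDs before anything is sent. The name build_batch_payload is hypothetical and not part of any documented client library; it only illustrates the [ID, arguments...] layout and the uniqueness recommendation.

```python
from typing import Any, Iterable, Sequence


def build_batch_payload(items: Iterable[Sequence[Any]]) -> dict:
    """Build the {"data": [[id, arg, ...], ...]} body for a batch request.

    `items` is an iterable of rows, each shaped like [id, arg1, arg2, ...].
    Raises ValueError if a row is empty or if an ID is repeated.
    (Hypothetical helper for illustration only.)
    """
    rows = []
    seen_ids = set()
    for item in items:
        row = list(item)
        if not row:
            raise ValueError("each batch item needs at least an ID")
        request_id = row[0]
        if request_id in seen_ids:
            raise ValueError(f"duplicate request ID: {request_id!r}")
        seen_ids.add(request_id)
        rows.append(row)
    return {"data": rows}


payload = build_batch_payload([(1, 10), (2, 11), (3, 12)])
print(payload)  # {'data': [[1, 10], [2, 11], [3, 12]]}
```

Checking for duplicates on the client side is a design choice here, not a documented requirement of the service; it simply surfaces ID collisions early.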
Using curl
Calling example_doubler with curl:
    curl -s -XPOST "https://<your-workspace-url>/v1/example_doubler/latest" \
      -d '{"data": [[1, 10], [2, 11], [3, 12]]}'
Using Python
Calling example_doubler with Python and the requests package:
    import json, requests
    requests.post(
        "https://<your-workspace-url>/v1/example_doubler/latest",
        headers={"Content-Type": "application/json"},
        data=json.dumps({"data": [[1, 10], [2, 11], [3, 12]]}),
    ).json()
Response format
The REST response for deployments called in batch is similar to the request format. The response is a list of the outputs from the deployment's main function, each paired with the ID that was passed in.
The response from example_doubler called with the batch of three inputs above is:
    {
      "data": [
        [1, 20],
        [2, 22],
        [3, 24]
      ]
    }
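Because each response row pairs an ID with an output, it can be convenient to index the results by ID. The sketch below assumes the response body has exactly the shape shown above; outputs_by_id is a hypothetical helper name, not part of the service or the requests package.

```python
import json


def outputs_by_id(response_body: dict) -> dict:
    """Map each request ID to its output, given a parsed batch response."""
    return {request_id: output for request_id, output in response_body["data"]}


# The example_doubler response from above, as a JSON string.
raw = '{"data": [[1, 20], [2, 22], [3, 24]]}'
results = outputs_by_id(json.loads(raw))
print(results[2])  # 22
```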
More examples
Here are a few more examples of how to call deployments with batch inference requests.
Dictionary arguments
If your deployment expects a single dictionary, you can send it after the ID of the request in the data parameter.
For example, a request for the following example_dict_doubler expects an ID and a dictionary, [ID, {"value": <number>}]:
    def example_dict_doubler(d: dict):
        return d["value"] * 2
To call example_dict_doubler with requests 1 and 2, send [[1, {"value": 10}], [2, {"value": 20}]] as the data:
    curl -s -XPOST "https://<your-workspace-url>/v1/example_dict_doubler/latest" \
      -d '{"data": [[1, {"value": 10}], [2, {"value": 20}]]}'
Multiple arguments
If your deployment expects multiple arguments, you can send them as additional terms after the ID in the request array.
For example, a request for the following example_adder expects an ID and three numeric inputs, [ID, a, b, c]:
    def example_adder(a: int, b: int, c: int):
        return a + b + c
To call example_adder with requests 1 and 2, send [[1, 3, 4, 5], [2, 6, 7, 8]] as the data:
    curl -s -XPOST "https://<your-workspace-url>/v1/example_adder/latest" \
      -d '{"data": [[1, 3, 4, 5], [2, 6, 7, 8]]}'
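To see how each row maps onto the main function's arguments, the following local sketch mimics the batch behavior described on this page without making any network call. It is an illustration only, not the service's implementation: run_batch_locally is a hypothetical name, and the sort step assumes all IDs in a batch are of one comparable type (all numbers or all strings).

```python
from typing import Any, Callable


def run_batch_locally(main: Callable[..., Any], payload: dict) -> dict:
    """Apply `main` to each [id, *args] row and return [id, output] rows."""
    # Everything after the ID is passed to the main function positionally.
    rows = [[row[0], main(*row[1:])] for row in payload["data"]]
    # Responses are described as sorted by ID; assumes comparable IDs.
    rows.sort(key=lambda row: row[0])
    return {"data": rows}


def example_dict_doubler(d: dict):
    return d["value"] * 2


def example_adder(a: int, b: int, c: int):
    return a + b + c


adder_result = run_batch_locally(
    example_adder, {"data": [[1, 3, 4, 5], [2, 6, 7, 8]]}
)
print(adder_result)  # {'data': [[1, 12], [2, 21]]}

dict_result = run_batch_locally(
    example_dict_doubler, {"data": [[1, {"value": 10}], [2, {"value": 20}]]}
)
print(dict_result)  # {'data': [[1, 20], [2, 40]]}
```

A single dictionary argument and several positional arguments follow the same rule: the ID comes first, and the remaining terms become the function's arguments in order.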