Calling the REST API
Use the Modelbit REST API to add inferences from your deployments into products, websites, and apps. There are three easy ways to call the REST API:
- `modelbit.get_inference`
- `requests`
- `curl`
Using modelbit.get_inference

`modelbit.get_inference` is the recommended approach:
```python
import modelbit

modelbit.get_inference(
    workspace="<YOUR_WORKSPACE>",
    region="<YOUR_REGION>",
    deployment="example",
    data={...},
)
```
You can use the Python `requests` library to call your deployment's API:
```python
import json
import requests

requests.post(
    "https://<your-workspace-url>/v1/example/latest",
    headers={"Content-Type": "application/json"},
    data=json.dumps({"data": ...}),
).json()
```
If you're not in a Python environment, you can use standard REST POST requests. Here's an example using `curl`:

```bash
curl -s -XPOST "https://<your-workspace-url>/v1/example/latest" -d '{"data": ...}'
```
Making requests
The REST API supports two request formats, depending on whether you want a single inference or a batch of inferences.
- Single inference requests: The easiest way to fetch inferences from a deployment is one at a time. The single inference syntax suits use cases where you want a prediction about one customer or event per API call.
- Batch inference requests: To retrieve many inferences in a single REST call, use the batch request syntax. It's similar to the single request syntax, but takes a list of inference requests instead. Both formats are sketched below.
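Here's a minimal sketch of both formats using `modelbit.get_inference`. The feature vectors are hypothetical, and the `[id, input]` pairing for batch requests is an assumption based on common Modelbit examples; check your deployment's API page for the exact shape your function expects.

```python
import modelbit

# Single inference: one input per call. The shape of `data` must match
# your deployment's function signature ([5.1, 3.5, 1.4] is hypothetical).
single = modelbit.get_inference(
    workspace="<YOUR_WORKSPACE>",
    region="<YOUR_REGION>",
    deployment="example",
    data=[5.1, 3.5, 1.4],
)

# Batch inference: a list of [id, input] pairs in one call; the response
# pairs each id with its inference. (Assumed pairing; verify in your docs.)
batch = modelbit.get_inference(
    workspace="<YOUR_WORKSPACE>",
    region="<YOUR_REGION>",
    deployment="example",
    data=[
        [1, [5.1, 3.5, 1.4]],
        [2, [6.2, 2.9, 4.3]],
    ],
)
```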
See also
- Using API keys: You can limit access to your deployed models using API keys. You can create API keys in the Settings area of Modelbit and send them to deployments in the `Authorization` header (see the sketch after this list).
- Large responses can be returned as links instead of streams, for clients who prefer that format.
- Async responses are helpful when your client cannot wait for a response and wants a callback instead.
- Response timeouts can be adjusted to change the amount of time an inference is allowed to run.
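As a rough sketch of sending an API key with `requests`, assuming a bearer-token `Authorization` header (verify the exact format in your workspace's Settings area):

```python
import json
import requests

API_KEY = "<YOUR_API_KEY>"  # created in the Settings area of Modelbit

response = requests.post(
    "https://<your-workspace-url>/v1/example/latest",
    headers={
        "Content-Type": "application/json",
        # Assumed bearer-token format; confirm in your workspace settings.
        "Authorization": f"Bearer {API_KEY}",
    },
    data=json.dumps({"data": [5.1, 3.5, 1.4]}),  # hypothetical input
)
print(response.json())
```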