Calling the REST API
Use the Modelbit REST API to integrate inferences from your deployments into your products, websites, and apps. There are three easy ways to call the REST API:
- modelbit.get_inference
- requests
- curl
Using modelbit.get_inference is the recommended approach:
import modelbit
modelbit.get_inference(
    workspace="<YOUR_WORKSPACE>",
    region="<YOUR_REGION>",
    deployment="example",
    data={...})
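For example, a call to a hypothetical deployment named example that doubles its input might look like this. The workspace and region values below are placeholders, and the response dict is assumed to carry the result under a "data" key:
import modelbit

# Hypothetical values: substitute your own workspace and region
result = modelbit.get_inference(
    workspace="my-workspace",
    region="us-east-1",
    deployment="example",
    data=5)

print(result["data"])  # e.g. 10, if the deployment doubles its input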
You can use the Python requests library to call your deployment's API:
import json
import requests

requests.post("https://<your-workspace-url>/v1/example/latest",
    headers={"Content-Type": "application/json"},
    data=json.dumps({"data": ...})).json()
If you're not in a Python environment, you can make standard REST POST requests. Here's an example using curl:
curl -s -XPOST "https://<your-workspace-url>/v1/example/latest" -d '{"data": ...}'
Making requests
The REST API supports two request formats, depending on whether you want a single inference or a batch of inferences:
- Single inference requests: The easiest way to fetch inferences from a deployment is one at a time. The single inference syntax suits use cases where each API call requests a prediction about one customer or event.
- Batch inference requests: To retrieve many inferences in a single REST call, use the batch request syntax. It's similar to the single request syntax, but takes a list of inference requests instead. Both formats are sketched below.
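As a minimal sketch of the difference, assuming a hypothetical deployment named example whose function takes a single number, and assuming batch items are [id, input] pairs whose IDs are echoed back with each result:
import json
import requests

URL = "https://<your-workspace-url>/v1/example/latest"
HEADERS = {"Content-Type": "application/json"}

# Single inference: "data" holds one request's input
single = requests.post(URL, headers=HEADERS,
    data=json.dumps({"data": 5})).json()

# Batch inference: "data" holds a list of [id, input] pairs;
# the response pairs each ID with its result
batch = requests.post(URL, headers=HEADERS,
    data=json.dumps({"data": [[1, 5], [2, 7]]})).json()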
See also
- Using API keys: You can limit access to your deployed models using API keys. Create API keys in the Settings area of Modelbit and send them to deployments in the Authorization header, as sketched below.
- Large responses can be returned as links instead of streams, for clients that prefer that format.
- Async responses are helpful when your client cannot wait for a response and wants a callback instead.
- Response timeouts can be adjusted to change the amount of time an inference is allowed to run.
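As a minimal sketch of sending an API key with a request (the key placeholder and the Bearer scheme shown here are assumptions; see the API keys docs for the exact header format your workspace expects):
import json
import requests

requests.post("https://<your-workspace-url>/v1/example/latest",
    headers={
        "Content-Type": "application/json",
        # Placeholder key; the Bearer scheme is an assumption here
        "Authorization": "Bearer <YOUR_API_KEY>",
    },
    data=json.dumps({"data": 5})).json()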