Calling the REST API

Use the Modelbit REST API to add inferences from your deployments into products, websites, and apps.

If you're in a Python environment, you can use modelbit.get_inference instead.

Single inference requests

The easiest way to fetch inferences from a deployment is one inference at a time. The single inference syntax is suited for use cases where you want predictions about one customer or event at a time.

curl -s -XPOST "https://<your-workspace-url>/v1/example/latest" \
-d '{"data": <inference-request>}'
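For callers that cannot shell out to curl, the same request can be sketched with Python's standard library. This is a minimal sketch, not Modelbit's client: the helper names build_inference_request and fetch_inference, the host workspace.example.com, and the input value 10 are all placeholders invented for illustration.

```python
import json
import urllib.request


def build_inference_request(workspace_url, deployment, data, version="latest"):
    """Build (but do not send) a POST request for a single inference.

    The URL shape and the {"data": ...} body mirror the curl example;
    workspace_url stands in for <your-workspace-url>.
    """
    url = f"https://{workspace_url}/v1/{deployment}/{version}"
    body = json.dumps({"data": data}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def fetch_inference(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder host and input value; replace with your own before sending.
req = build_inference_request("workspace.example.com", "example", 10)
```

Building the request separately from sending it makes the URL and body easy to inspect before any network call; fetch_inference(req) would perform the actual call against a real workspace.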

Learn more about single inference REST requests.

Batch inference requests

To retrieve many inferences in a single REST call, use the batch request syntax. It's similar to the single request syntax, but takes a list of inference requests instead of a single one.

curl -s -XPOST "https://<your-workspace-url>/v1/example/latest" \
-d '{"data": [[1, <request-1>], [2, <request-2>]]}'
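The batch body can also be assembled programmatically. The sketch below pairs each request with a number starting at 1, mirroring the 1 and 2 in the curl example. Treating that number as a per-request identifier is an assumption, and build_batch_payload and the input values 10 and 20 are hypothetical.

```python
import json


def build_batch_payload(inference_requests):
    """Wrap a list of requests in the batch shape shown in the curl example.

    Each item becomes [n, request], with n counting up from 1.
    """
    return {
        "data": [[n, item] for n, item in enumerate(inference_requests, start=1)]
    }


# Placeholder inputs standing in for <request-1> and <request-2>.
payload = build_batch_payload([10, 20])
body = json.dumps(payload)
```

The resulting body string can be sent to the same URL as a single inference request.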

Learn more about batch inference REST requests.

Using API keys

You may limit access to your deployed models using API keys. You can create API keys in the Settings area of Modelbit, and send them to deployments in the Authorization header.
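As a rough illustration of sending a key in the Authorization header, the sketch below attaches it to a request built with Python's standard library. It assumes the header value is the API key itself with no scheme prefix; confirm the exact format in the API key documentation. The environment variable MODELBIT_API_KEY, the fallback key, the host, and the helper name are hypothetical.

```python
import json
import os
import urllib.request


def build_authorized_request(url, data, api_key):
    """Build a POST request that carries an API key in the Authorization header."""
    body = json.dumps({"data": data}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Assumption: the raw key is the header value, with no "Bearer" prefix.
        headers={"Authorization": api_key},
        method="POST",
    )


# Hypothetical environment variable name; the fallback is a dummy value.
api_key = os.environ.get("MODELBIT_API_KEY", "example-key")
req = build_authorized_request(
    "https://workspace.example.com/v1/example/latest", 10, api_key
)
```

Reading the key from the environment keeps it out of source control.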

Learn more about REST requests with API keys.

The REST URL format

The REST URL for your deployments can take a few different formats depending on your use case:

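The simplest ends with <your-deployment>/latest, following the same pattern as the curl examples above:

https://<your-workspace-url>/v1/<your-deployment>/latest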


The /latest URL will always call the most recent version of your deployment.

Modelbit keeps all of your deployment's previous versions so you can still call them. To call a previous version of a deployment, replace latest with the deployment's version number. For example, to call version 5 of example_deployment:

https://<your-workspace-url>/v1/example_deployment/5


You can also create a version alias in Modelbit, and then move that alias between versions. This is useful when you want to check a single URL into your product and later change which deployment version that URL points to. For example, an alias called official looks like this:

https://<your-workspace-url>/v1/example_deployment/official
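The three URL forms in this section differ only in their final path segment, which a small helper can make explicit. This is a sketch under assumptions: deployment_url and the host workspace.example.com are hypothetical, the /v1/ prefix is taken from the curl examples, and placing the alias in the same position as the version is inferred from the description.

```python
def deployment_url(workspace_url, deployment, target="latest"):
    """Return the REST URL for a deployment.

    target may be "latest", a version number, or an alias name.
    """
    return f"https://{workspace_url}/v1/{deployment}/{target}"


# Placeholder host; replace with your own workspace URL.
host = "workspace.example.com"
latest_url = deployment_url(host, "example_deployment")
version_url = deployment_url(host, "example_deployment", 5)
alias_url = deployment_url(host, "example_deployment", "official")
```

Checking an alias URL into a product means the code never changes when the alias is moved to a different version.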


See also

  • Large responses can be returned as links instead of streams, for clients who prefer that format.
  • Async responses are helpful when your client cannot wait for a response and wants a callback instead.
  • Response timeouts can be adjusted to change the amount of time an inference is allowed to run.