Using datasets

Modelbit Datasets are data frames that can be used as training data or as feature stores in deployed models.

Datasets are created from the results of SQL queries run on your SQL warehouse.

Creating a dataset

If you haven't already, connect a SQL warehouse to Modelbit.

In the Datasets tab, click New Dataset. Use the SQL editor to create a query that returns the data you want in your dataset.

Save your dataset to make it available for use in training and deployments.

Fetching a dataset

In your notebook, use mb.get_dataset to download your dataset:

df = mb.get_dataset("your_dataset") # returns a pandas DataFrame

You can then use df as training data for your models.
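
As a minimal sketch of that step, the snippet below splits a DataFrame into training and held-out rows using only pandas. The stand-in frame, its column names (age, plan, churned), and the helper split_train_test are hypothetical; in practice df would come from mb.get_dataset.

```python
import pandas as pd


def split_train_test(df: pd.DataFrame, test_fraction: float = 0.25, seed: int = 0):
    """Randomly split df into (train, test) frames with no shared rows.

    Assumes df has a unique index, which is true of a default RangeIndex.
    """
    test = df.sample(frac=test_fraction, random_state=seed)
    train = df.drop(test.index)
    return train, test


# Stand-in for df = mb.get_dataset("your_dataset"); columns are hypothetical.
df = pd.DataFrame(
    {
        "age": [23, 35, 41, 52, 29, 60, 47, 33],
        "plan": ["free", "pro", "pro", "free", "free", "pro", "free", "pro"],
        "churned": [0, 0, 1, 1, 0, 1, 0, 0],
    }
)

train, test = split_train_test(df)

# Separate the feature columns from the label column for model fitting.
X_train, y_train = train.drop(columns=["churned"]), train["churned"]
X_test, y_test = test.drop(columns=["churned"]), test["churned"]
```

The resulting X_train and y_train can be passed to whichever training library you use; fixing the seed keeps the split reproducible between notebook runs.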

Filtering a dataset

Instead of fetching the whole dataset, you can fetch specific rows and use your dataset as a feature store.

Read the next section on feature stores for more information.
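
To illustrate the idea of treating a dataset as a feature store, here is a rough sketch that looks up rows by key in an already-fetched DataFrame using pandas. It filters locally rather than at fetch time, so it shows only the shape of the lookup and not how Modelbit itself selects rows. The column user_id, the sample values, and the helper lookup_features are hypothetical.

```python
import pandas as pd


def lookup_features(df: pd.DataFrame, key_column: str, keys: list) -> pd.DataFrame:
    """Return only the rows whose key_column value is in keys."""
    return df[df[key_column].isin(keys)].reset_index(drop=True)


# Stand-in for a fetched dataset; columns and values are hypothetical.
features = pd.DataFrame(
    {
        "user_id": [101, 102, 103, 104],
        "orders_last_30d": [3, 0, 7, 1],
        "avg_order_value": [25.0, 0.0, 41.5, 12.0],
    }
)

# Fetch the feature rows for two specific users.
rows = lookup_features(features, "user_id", [102, 104])
```

A deployed model typically needs the features for only one or a few keys per request, which is why fetching specific rows is preferable to downloading the whole dataset.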