Datasets in Modelbit
Modelbit Datasets are DataFrames that serve as feature stores in deployments and as training data for new models. Each dataset is created from the results of a query run on your SQL warehouse.
Creating a dataset
If you haven't already, connect a SQL Warehouse to Modelbit.
In the Datasets tab, click New Dataset. Use the SQL editor to write a query that returns the data you want in your dataset.
Save your Dataset to make it available for use in training and deployments.
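To illustrate the query step, the sketch below prototypes a dataset query locally before it goes into the SQL editor. It uses Python's built-in sqlite3 module as a stand-in for your SQL warehouse. The orders table, its columns, and the sample rows are hypothetical, and your warehouse's SQL dialect may differ.

```python
import sqlite3

# Hypothetical query of the kind you might paste into the SQL editor:
# one row of aggregate features per customer.
DATASET_QUERY = """
SELECT customer_id AS CUSTOMER_ID,
       COUNT(*)    AS ORDER_COUNT,
       SUM(amount) AS TOTAL_SPEND
FROM orders
GROUP BY customer_id
ORDER BY customer_id
"""


def preview_dataset_query():
    """Run DATASET_QUERY against a tiny in-memory table and return its rows."""
    conn = sqlite3.connect(":memory:")  # stand-in for the real warehouse
    try:
        conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
        conn.executemany(
            "INSERT INTO orders VALUES (?, ?)",
            [(1, 10.0), (1, 5.0), (2, 7.5)],  # made-up sample data
        )
        return conn.execute(DATASET_QUERY).fetchall()
    finally:
        conn.close()


rows = preview_dataset_query()
print(rows)
```

Checking the shape of the result on a handful of rows like this can catch grouping or aliasing mistakes before the query is saved as a dataset.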
Using a dataset
In your Python environment, use mb.get_dataset to download your dataset:
import modelbit
mb = modelbit.login()  # authenticate this Python session with Modelbit

# Fetch the entire dataset as a pandas DataFrame
df = mb.get_dataset("your_dataset")

# Or fetch only certain rows; customerId is a value you define beforehand
df = mb.get_dataset("your_dataset", filters={"CUSTOMER_ID": customerId})
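Code that depends on mb.get_dataset can be exercised without a warehouse by passing the client in as an argument. The following is a minimal sketch under stated assumptions: FakeModelbitClient and fetch_customer_rows are hypothetical names, not part of Modelbit. The fake returns a list of dicts instead of a pandas DataFrame, and it assumes that filters keeps rows whose column equals the given value, which may not match Modelbit's actual behavior in every case.

```python
class FakeModelbitClient:
    """Hypothetical stand-in for `mb`, for local tests only. NOT the Modelbit API."""

    def __init__(self, datasets):
        # Maps a dataset name to a list of row dicts.
        self._datasets = datasets

    def get_dataset(self, name, filters=None):
        rows = self._datasets[name]
        if not filters:
            return list(rows)
        # Assumed semantics: keep rows whose columns equal every filter value.
        return [
            row for row in rows
            if all(row.get(col) == value for col, value in filters.items())
        ]


def fetch_customer_rows(client, customer_id, dataset_name="your_dataset"):
    """Fetch only the rows for one customer via the client's get_dataset."""
    return client.get_dataset(dataset_name, filters={"CUSTOMER_ID": customer_id})


fake_mb = FakeModelbitClient({
    "your_dataset": [
        {"CUSTOMER_ID": 1, "TOTAL_SPEND": 15.0},
        {"CUSTOMER_ID": 2, "TOTAL_SPEND": 7.5},
    ]
})
customer_rows = fetch_customer_rows(fake_mb, 2)
print(customer_rows)
```

In real use you would pass the mb object itself as client, so the helper stays unchanged between tests and deployments.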