Using datasets
Modelbit datasets can be used to snapshot training data and to serve as feature stores for deployed models.
To list the available datasets, run the following in your notebook:
mb.datasets()
The output will list your datasets:
Name         Date Refreshed    SQL Updated       Rows       Bytes
taxis        2 days ago        3 days ago        85,000     6 MB
squirrels    3 hours ago       1 day ago         3,023      625 KB
nba games    10 minutes ago    10 minutes ago    626,111    68 MB
To download a dataset into memory as a Pandas DataFrame:
df = mb.get_dataset("nba games")
You can then use df as training data for your models.
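As a rough illustration of what using the DataFrame as training data can look like, the sketch below splits a DataFrame into training and test sets with plain Pandas. It does not call Modelbit: the small DataFrame is a made-up stand-in for the result of mb.get_dataset("nba games"), and the column names (home_rest_days, away_rest_days, home_win) are hypothetical, since the real schema depends on the SQL behind your dataset.

```python
import pandas as pd

# Stand-in for: df = mb.get_dataset("nba games")
# The columns and values here are invented for illustration only.
df = pd.DataFrame({
    "home_rest_days": [1, 2, 0, 3, 1, 2, 1, 0, 4, 2],
    "away_rest_days": [2, 1, 1, 0, 3, 2, 0, 1, 1, 3],
    "home_win":       [1, 0, 0, 1, 0, 1, 1, 0, 1, 0],
})

# Hold out 20% of the rows for testing; a fixed seed keeps the split repeatable.
train = df.sample(frac=0.8, random_state=0)
test = df.drop(train.index)

# Separate feature columns from the (hypothetical) label column.
feature_cols = ["home_rest_days", "away_rest_days"]
X_train, y_train = train[feature_cols], train["home_win"]
X_test, y_test = test[feature_cols], test["home_win"]

print(len(X_train), "training rows,", len(X_test), "test rows")
```

With a real dataset, you would replace the stand-in DataFrame with the df returned by mb.get_dataset and pass X_train and y_train to the training library of your choice.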
If instead you want to use the dataset within your deployment, move on to the next section on feature stores.