Using Snowflake and Snowpark with Modelbit
When you use Snowpark to run your Modelbit deployments, the deployments run within your Snowflake warehouse. This lets you use your Snowflake compute capacity to run inferences, which is especially useful for very large inference jobs. As with SQL queries, the size of your warehouse is the primary factor determining how quickly your deployment runs.
Enabling Snowpark
To use Modelbit deployments with Snowpark, make sure you've configured the Snowpark integration in Modelbit's settings area for your warehouse.
Modelbit will then create Snowpark User-Defined Functions (UDFs) within your warehouse for each deployment. Those UDFs execute your deployment and return results, just like any other SQL function. Sample code for calling your deployments from SQL is in the API Endpoints screen of each deployment in Modelbit.
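As a sketch of what that SQL looks like, here is a hypothetical deployment named lead_scorer with two numeric feature columns; the actual UDF name and signature for your deployment are shown on its API Endpoints screen in Modelbit.

```python
# Hypothetical deployment "lead_scorer" taking two feature columns.
deployment = "lead_scorer"
features = ["visits", "pages_viewed"]

# The UDF is called like any other SQL function in your warehouse.
sql = f"select {deployment}({', '.join(features)}) as prediction from leads"
print(sql)  # select lead_scorer(visits, pages_viewed) as prediction from leads
```

With a live connection you would execute that string through your usual Snowflake client (e.g. the Snowflake Python connector).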
Developing for Snowflake and Snowpark
Since your deployments are running within your Snowflake warehouse, there are some limitations to consider when creating deployments. Accounting for these limitations during development will make adding the completed deployment to your Snowflake warehouse easier.
Python Packages
The Snowpark environment that's running your deployment has access to a couple thousand Python packages and tens of thousands of versions of those packages. By contrast, PyPI has several hundred thousand packages and millions of versions. It's likely that some of the packages you're using aren't available in Snowpark.
When your packages aren't in Snowpark, you have several options:
1. Change the version you're using
If your Python package is available in Snowpark, but your version isn't, you can change the version you're using to one supported by Snowpark.
This is most likely to succeed with popular packages like scikit-learn or spaCy. To see what versions are available, run the following query:
select * from information_schema.packages where language = 'python';
The results will show the packages and versions that are available for each supported version of Python.
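To pick a compatible version, you can filter those results down to your package and Python runtime. This sketch uses illustrative sample rows, not real Snowpark data; the real query returns additional columns.

```python
# Sample rows in the shape of information_schema.packages results:
# (package name, version, language, runtime version). Illustrative only.
rows = [
    ("scikit-learn", "1.2.2", "python", "3.10"),
    ("scikit-learn", "1.3.0", "python", "3.10"),
    ("spacy", "3.5.0", "python", "3.10"),
]

def available_versions(rows, package, runtime):
    """Return the sorted versions of `package` available for `runtime`."""
    return sorted(version for name, version, language, rt in rows
                  if name == package and rt == runtime)

print(available_versions(rows, "scikit-learn", "3.10"))  # ['1.2.2', '1.3.0']
```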
2. Add the package as a Modelbit private package
You can use Modelbit's private packages feature to "vendor" packages from PyPI and make them available in Snowpark.
Head to your package's page on PyPI, click Release history, and download the version you need. Then use mb.add_package to upload that version to Modelbit. The next time you deploy, Modelbit will find that package in your private packages and add it to the Snowpark environment.
This works for Python packages that don't have side effects (like downloading files during installation) and don't require compiling C extensions.
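The vendoring step can be sketched as below. The package name "mypkg" and its local path are placeholders, and the mb.add_package call is commented out; confirm its exact signature in Modelbit's private packages documentation before running.

```python
# Hypothetical package "mypkg" at version 1.2.3, downloaded from its
# Release history page on PyPI and unpacked locally.
package_dir = "./mypkg-1.2.3"  # placeholder path to the unpacked package

# Hedged sketch of the upload step -- verify the signature in Modelbit's docs:
# import modelbit
# mb = modelbit.login()
# mb.add_package(package_dir)  # makes the package available to deployments
print(package_dir)
```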
3. Ask Snowflake to add your package
The Snowflake Snowpark team frequently adds new packages to the Snowpark environment. Reach out to your Snowflake account manager and they can help prioritize adding the packages you need to Snowpark.
Hardware configurations
The standard Snowflake warehouse has limited disk space (about 0.5GB) and RAM. This means that deployments depending on large pickles or checkpoint files may not work.
One alternative is to upgrade to a Snowpark-optimized warehouse for significantly more RAM and disk space.
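As a sketch, the SQL to create such a warehouse is built as a Python string below. The warehouse name "modelbit_wh" and the MEDIUM size are placeholders; Snowpark-optimized warehouses require a minimum size, so check Snowflake's warehouse documentation for your account's options.

```python
# Placeholder warehouse name; adjust size per Snowflake's requirements
# for Snowpark-optimized warehouses.
warehouse = "modelbit_wh"
sql = (f"create warehouse {warehouse} "
       "warehouse_size = 'MEDIUM' "
       "warehouse_type = 'SNOWPARK-OPTIMIZED'")
print(sql)
```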
GPUs are not supported in Snowpark environments at this time.