Example Prince PCA deployment

In this example we'll train a sklearn.linear_model.LogisticRegression on the output of a prince Principal Component Analysis (PCA) that reduces the Iris sample dataset from four features to two.

First, import and log in to Modelbit:

import modelbit
mb = modelbit.login()

We'll reduce the four primary features (Sepal length, Sepal width, Petal length, and Petal width) of the Iris dataset to two latent features (n_components=2) using PCA:

import pandas as pd
import prince
from sklearn import datasets, linear_model

X, y = datasets.load_iris(return_X_y=True)
X = pd.DataFrame(
    data=X,
    columns=['Sepal length', 'Sepal width', 'Petal length', 'Petal width'])

pca = prince.PCA(
    n_components=2,
    n_iter=3,
    rescale_with_mean=True,
    rescale_with_std=True,
    copy=True,
    check_input=True,
    engine='auto',
    random_state=42
)
pca = pca.fit(X)
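
To make the reduction from four features to two more concrete, here is a minimal NumPy sketch of standardized PCA. It is not prince's implementation: prince may use a randomized solver, and its component signs or scaling may differ. The variable names (X_std, X_reduced, explained) are illustrative only.

```python
import numpy as np
from sklearn import datasets

X, _ = datasets.load_iris(return_X_y=True)

# Center and scale each feature, the analogue of
# rescale_with_mean=True and rescale_with_std=True.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# SVD of the standardized data; the rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(X_std, full_matrices=False)

# Project onto the first two directions: 4 columns in, 2 columns out.
X_reduced = X_std @ Vt[:2].T

# Share of the total variance carried by each component.
explained = (S ** 2) / np.sum(S ** 2)

print(X_reduced.shape)
print(explained[:2].sum())
```

The printed share of variance gives a rough indication of how much information the two latent features retain.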

Then we'll train a LogisticRegression on the Iris dataset, after it's been transformed by Prince:

lr = linear_model.LogisticRegression()
lr.fit(pca.transform(X), y)
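
To get a rough sense of how well a classifier performs on only two latent features, the sketch below cross-validates a comparable pipeline. It swaps in scikit-learn's own StandardScaler and PCA for prince so that it runs on its own, so the scores are indicative and may not match the prince-based model exactly.

```python
from sklearn import datasets, decomposition, linear_model, pipeline, preprocessing
from sklearn.model_selection import cross_val_score

X, y = datasets.load_iris(return_X_y=True)

# Stand-in for the prince PCA + LogisticRegression combination:
# scale, reduce to two components, then classify.
model = pipeline.make_pipeline(
    preprocessing.StandardScaler(),
    decomposition.PCA(n_components=2, random_state=42),
    linear_model.LogisticRegression(),
)

# 5-fold cross-validation refits the scaler and PCA inside each fold,
# so the held-out rows never influence the transformation.
scores = cross_val_score(model, X, y, cv=5)
mean_accuracy = scores.mean()
print(mean_accuracy)
```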

Finally, we'll make a deployment function that returns the predicted name of the flower. Because label_dict is defined outside the deployment function, Modelbit will automatically include it as a dependency in the deployment:

label_dict = {0: 'Setosa', 1: 'Versicolor', 2: 'Virginica'}

def predict_with_prince(flower_features) -> str:
    input_df = pd.DataFrame.from_dict(flower_features)
    input_pca = pca.transform(input_df)
    return label_dict[lr.predict(input_pca)[0]]

mb.deploy(predict_with_prince, python_packages=["prince==0.6.3", "scikit-learn==0.23.2"], python_version="3.7")
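
Because predict_with_prince passes its argument to pd.DataFrame.from_dict, the shape of flower_features matters. The sketch below assumes a column-oriented dictionary whose keys match the training column names and whose values are lists. The sample measurements are the first row of the Iris dataset.

```python
import pandas as pd

# Assumed input shape: one list per feature, keyed by the training column names.
flower_features = {
    'Sepal length': [5.1],
    'Sepal width': [3.5],
    'Petal length': [1.4],
    'Petal width': [0.2],
}

input_df = pd.DataFrame.from_dict(flower_features)
print(input_df.shape)
print(list(input_df.columns))

# Bare scalars instead of lists are rejected by pandas,
# because it cannot infer an index for the single row.
try:
    pd.DataFrame.from_dict({'Sepal length': 5.1, 'Sepal width': 3.5})
    scalar_input_ok = True
except ValueError:
    scalar_input_ok = False
print(scalar_input_ok)
```

Wrapping each value in a list produces a one-row DataFrame with the same column names the PCA was fitted on.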

The deployment can then be called from REST, Snowflake, and Redshift.
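
As a rough sketch of a REST call, the snippet below builds a POST request with Python's standard library. The endpoint URL is a placeholder, and both the path layout and the {"data": ...} payload wrapper are assumptions; check the deployment's API details in your Modelbit workspace for the exact URL and request format.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the URL shown for your own deployment.
ENDPOINT = "https://YOUR_WORKSPACE.app.modelbit.com/v1/predict_with_prince/latest"

flower_features = {
    'Sepal length': [5.1],
    'Sepal width': [3.5],
    'Petal length': [1.4],
    'Petal width': [0.2],
}

# Assumed payload shape: the function argument wrapped under a "data" key.
body = json.dumps({"data": flower_features}).encode("utf-8")

request = urllib.request.Request(
    ENDPOINT,
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# The request is only constructed here. To send it, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```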