Modelbit's Git-based architecture

Modelbit stores all of your models, source code, environment requirements, and infrastructure configuration in a Git repository for you. Modelbit is backed by Git to make all of your Git-based workflows fit naturally with your Modelbit workflows. Much of the time, the buttons in the web app and the Python APIs are secretly just shortcuts for making and pushing Git commits!



Once a commit has been pushed to Modelbit, the change is inspected and applied to your workspace. For example, if you changed some source code, then a new deployment version is created and deployed. If you changed an environment's requirements, then a new pip install runs to build a new Docker image, and that image is used with your model. Likewise, you can revert several commits and push that change to perform a rollback.
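As a rough illustration of the "inspect, then apply" idea, the sketch below maps the files changed in a commit to follow-up actions. It is not Modelbit's implementation: the function name plan_actions, the action names, and the rule that a file called requirements.txt signals an environment change are all assumptions made for this example.

```python
from pathlib import PurePosixPath


def plan_actions(changed_paths):
    """Map the files changed in a pushed commit to follow-up actions.

    Hypothetical rules, for illustration only:
      - a changed requirements.txt triggers an environment rebuild
      - a changed .py file triggers a new deployment version
    """
    actions = set()
    for raw in changed_paths:
        path = PurePosixPath(raw)
        if path.name == "requirements.txt":
            actions.add("rebuild_environment")
        elif path.suffix == ".py":
            actions.add("create_deployment_version")
    return sorted(actions)
```

Under these assumed rules, a commit that touches only documentation plans no actions, while a commit that touches both source code and requirements plans both.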

Branches are development environments

Creating a new Git branch creates a fork of your Modelbit production environment for you to use as a development or staging environment. Just like any other change made in Git, changes made on one branch will not affect other branches. For example, changing models in the model registry only applies to deployments running on that branch. Deployments on other branches won't see the change.

This means you can test out changes to your models or their dependencies in an exact replica of your main branch. Your dev branch will come with the same high-performance REST APIs and infrastructure as main.

Connect GitHub (or others) for CI/CD

Once your development branch is ready to merge into main, you can have a peer review the changes with a pull request. Configure GitHub, GitLab or Azure DevOps and you'll be able to create pull requests, protect your main branch, and implement advanced CI/CD rules and tests.

You'll see all of your source code and configuration changes in the pull request, making it easy to know which changes are about to merge to main.

After you connect Modelbit to your Git provider, Modelbit continues to maintain a Git repository for your workspace. This means Modelbit will continue working even if GitHub is down or even if you delete the repository on GitHub. Modelbit uses multi-master sync to keep the Modelbit copy of the repo in sync with your Git provider.

Large files are handled automatically

When you modelbit clone, your local directory is automatically configured to work with large files. Large model pickles or checkpoints don't get stored in Git. Instead, the file's data is encrypted and stored in Modelbit while a "stub" file is stored in the repository. Stub files are small YAML files containing information about the original file.

That means whenever you git add a large file, a copy of the file's data is encrypted and uploaded to Modelbit and a stub file gets written to the Git repository. When you git pull, any new stub files are automatically replaced by downloading and decrypting the original file.

Working with large files in Modelbit's Git repository is like working with any other file. Special smudge and clean filters automatically handle your large files, so you won't need to think about it. You and your teammates' local Git repos will always have the original (non-stub) version of your large files.
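To make the clean and smudge idea concrete, here is a minimal sketch of how such a filter pair can work in general. It is not Modelbit's actual filter: the size threshold, the stub fields (stub_kind, sha256, size_bytes), and the in-memory remote_store that stands in for encrypted storage are all assumptions, and encryption is omitted entirely.

```python
import hashlib

# Hypothetical values for illustration; Modelbit's real threshold and
# stub format are not described here.
LARGE_FILE_BYTES = 1_000_000
STUB_HEADER = "stub_kind: hypothetical-large-file\n"

# In-memory stand-in for remote storage, keyed by content hash.
remote_store = {}


def clean(data: bytes) -> bytes:
    """Runs on git add: replace large content with a small YAML stub."""
    if len(data) < LARGE_FILE_BYTES:
        return data  # small files pass through unchanged
    digest = hashlib.sha256(data).hexdigest()
    remote_store[digest] = data  # a real filter would encrypt and upload here
    stub = f"{STUB_HEADER}sha256: {digest}\nsize_bytes: {len(data)}\n"
    return stub.encode("utf-8")


def smudge(data: bytes) -> bytes:
    """Runs on checkout or pull: replace a stub with the original content."""
    if not data.startswith(STUB_HEADER.encode("utf-8")):
        return data  # not a stub, so leave the file as it is
    fields = dict(
        line.split(": ", 1) for line in data.decode("utf-8").splitlines()
    )
    return remote_store[fields["sha256"]]
```

Passing a large payload through clean and then smudge returns the original bytes, which is why the working directory always holds the real file while the repository holds only the stub.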

Modelbit's Git repo has a structure

Modelbit enforces a directory structure within the Git repository. Deployments, models, training jobs, and all the other assets stored in the Git repo are expected to be stored in specific directories. Additional non-Modelbit files and directories are not allowed within the Modelbit Git repository.

Before you run git add, run modelbit validate to check that your repository is correctly organized.
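If you want to script this check, one option is a small wrapper that stages files only when validation passes. This is a sketch under two assumptions: that modelbit validate signals failure with a non-zero exit code, and that the helper name validate_then_add is our own. The run parameter exists only so the command runner can be swapped out in tests.

```python
import subprocess


def validate_then_add(paths, run=subprocess.run):
    """Stage paths with git add only if `modelbit validate` succeeds.

    Assumes `modelbit validate` exits non-zero when the repository
    layout is invalid. Returns True if the files were staged.
    """
    result = run(["modelbit", "validate"])
    if result.returncode != 0:
        return False
    # "--" keeps git from treating a path as an option
    run(["git", "add", "--", *paths], check=True)
    return True
```

Called from the root of your cloned repository, it returns False and stages nothing when validation fails.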