Avoid Building and Maintaining an MLOps Training Environment
AutoDevTech helps teams write better code by understanding code coverage, churn, and engagement. Using sophisticated machine learning techniques, the AutoDevTech platform accelerates a team’s efficiency by providing valuable insights into their development process in the context of industry norms, systematically exposing teams to best practices from some of the most well-crafted software.
Nick Gerner, AutoDevTech’s CEO/Founder and builder of many engineering teams, and Bora Banjanin, AutoDevTech’s lead Applied Scientist, are tackling the complex task of preserving knowledge from the past and making that knowledge available to new engineers.
[AutoDevTech’s Review Assistant]
AutoDevTech’s objective is to help write and validate code, making engineers more efficient. When it came time to scale their machine learning training, they quickly realized they themselves needed to be more efficient. They saw using Grid as an opportunity to avoid the complexity of building and maintaining their own MLOps training environment.
Prior to using Grid, the AutoDevTech team focused on traditional statistical regression models. They wanted to leverage more sophisticated models and evaluated infrastructure solutions such as Horovod and SageMaker, but realized these would require a greater engineering effort to achieve the desired state-of-the-art performance.
“Looking into a service like Grid, we wanted to be using more sophisticated methods and the only way to do that was with large scale distributed training.” – Bora Banjanin, Applied Scientist, AutoDevTech
The team benefited from:
- Training from laptop to cloud without code changes
- Easily scaling to a large number of clustered machines
- Open-source software and the open-source community
- Avoiding a complex MLOps project
- Affordable and transparent pricing
Having enjoyed many of the benefits offered by the PyTorch Lightning platform and community, it was an easy decision for AutoDevTech to leverage Grid and have one team support all their ML lifecycle needs. PyTorch Lightning already provided a significant amount of simplification with Lightning Trainer, and given their plans to use DeepSpeed integration in the future, it was natural to work in the Grid platform.
“We can now turn out large experiments on large-scale distributed models, allowing my engineers to make decisions on what to do next. How big is the context, should it be big or smaller, etc. I don’t think we could get answers without using Grid.ai.” – Nick Gerner, CEO / Founder, AutoDevTech
The services Grid offered enabled their machine learning engineers to focus on machine learning. To compete and move to market faster, the team needed more sophisticated methods, and the only way to achieve their goals was through large-scale distributed training. The team continued to leverage their own AWS environment with Grid as a tenant in their Virtual Private Cloud (VPC). This allowed AutoDevTech to leverage its existing platform while also taking advantage of Grid.
Grid Datastore management was an important function of the platform. The ability to pull data directly from the Grid Datastore simplified data management and sped up development.
The simplicity of Grid Runs enabled teams to quickly determine which resources were available and easily leverage Spot instances to deliver maximum value. With Auto-resume, the AutoDevTech team will be able to restart instances when Spot instances are reclaimed.
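The idea behind resuming after a Spot reclaim can be illustrated with a small, self-contained sketch. This is not Grid's implementation or API: the names `train`, `run_with_auto_resume`, and `SpotReclaimed` are hypothetical, the "training step" is a stand-in, and the reclaim is simulated with an exception. It only shows the general pattern of checkpointing progress so that a restarted job picks up where the previous one stopped.

```python
import json
import os
import tempfile


class SpotReclaimed(Exception):
    """Hypothetical signal: simulates a Spot instance being reclaimed."""


def train(checkpoint_path, total_steps, reclaim_at=None):
    """Run a toy training loop, resuming from checkpoint_path if it exists."""
    state = {"step": 0, "loss": 1.0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)

    while state["step"] < total_steps:
        if reclaim_at is not None and state["step"] == reclaim_at:
            raise SpotReclaimed(f"reclaimed at step {state['step']}")
        # Stand-in for one real optimisation step.
        state["step"] += 1
        state["loss"] *= 0.9
        # Write to a temp file, then rename, so an interruption
        # mid-write cannot leave a corrupt checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)
    return state


def run_with_auto_resume(checkpoint_path, total_steps, reclaim_points):
    """Restart training after each simulated reclaim until it completes."""
    pending = list(reclaim_points)
    restarts = 0
    while True:
        reclaim_at = pending.pop(0) if pending else None
        try:
            return train(checkpoint_path, total_steps, reclaim_at), restarts
        except SpotReclaimed:
            restarts += 1


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        ckpt = os.path.join(workdir, "ckpt.json")
        final_state, restarts = run_with_auto_resume(ckpt, 10, [3, 7])
        print(final_state["step"], restarts)
```

In this sketch the job is interrupted twice yet still completes all ten steps, because each restart reloads the last saved step instead of starting over. A real setup would checkpoint model and optimizer state rather than a small JSON dictionary.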
Using a platform to keep up with the quickly changing machine learning space justified Nick’s decision to move to Grid. Additionally, support from the Grid staff and from the PyTorch Lightning and Grid communities has made Grid a key component in maximizing the value of the product AutoDevTech delivers to its customers.
Getting Started with Grid:
Interested in learning more about how Grid can help you manage deep learning model development for your next project? Get started with Grid’s free community tier account (and get $25 in free credits!) by clicking here. Also, explore our documentation and join the Slack community to learn more about what the Grid platform can do for you.