This article is an introduction to MLOps: what it is, its similarities to and differences from DevOps, and how it can help organizations navigate common issues encountered in everyday work.
If you have ever worked with a Machine Learning (ML) model in a production environment, you might have heard of MLOps. The term refers to optimizing the ML lifecycle by bridging the gap between design, model development, and operations.
Nowadays, MLOps is not just a concept but a much-discussed aspect of machine learning that is growing in importance every day as more and more teams are working on implementing AI solutions for real-world use cases. If done right, it helps teams all around the globe to develop and deploy ML solutions much faster.
Usually, when you read up on the term, you will find MLOps described as DevOps for Machine Learning. That is why the easiest way to understand the MLOps concept is to return to its origins and draw a parallel between it and DevOps.
Let’s jump in.
If you already know what DevOps is, feel free to skip this part.
DevOps (Development and Operations) is a set of practices that combines software development (Dev), testing, and IT operations (Ops). DevOps aims to turn these separate processes into a continuous pipeline of interconnected steps. So, if you follow the DevOps philosophy, you can shorten the systems development life cycle and deliver continuously while keeping software quality high.
The core principles of DevOps are process automation, feedback loops, and the CI/CD concept.
In software engineering, the CI/CD loop refers to the combined practices of continuous integration (CI), continuous delivery (CD), and continuous deployment. Let’s define these terms and check how the loop works.
CI is the practice of automating the integration of code changes from multiple contributors into a single software project. It helps teams merge changes frequently and catch conflicts early.
CD provides automated and consistent code delivery to various environments, for example, testing or development. When the newest iteration of the code is delivered and passes the automated tests, it is time for continuous deployment that automatically deploys the updated version into production.
To simplify, CI is a set of practices performed during the coding stage, whereas CD practices are applied whenever the code is ready.
So, the CI/CD loop combines software development, testing, and deployment processes into one workflow. It heavily relies on automation and aims to accelerate software development. With CI/CD, development teams can quickly deploy minor edits or features. Thus, you can develop software in quick iterations while keeping its quality.
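To make the loop concrete, here is a minimal sketch of a pipeline that runs its stages in order and reaches deployment only if every earlier stage succeeds. The stage functions (`build`, `run_tests`, `deploy`) are hypothetical stand-ins, not part of any real CI/CD tool. In practice, a CI/CD service would run these steps for you.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a function that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order, stop at the first failure, return completed stage names."""
    completed = []
    for name, step in stages:
        if not step():
            break  # a failed stage blocks everything after it, including deploy
        completed.append(name)
    return completed


# Hypothetical stand-ins for real build, test, and deploy steps.
def build() -> bool:
    return True


def run_tests() -> bool:
    return 2 + 2 == 4


def deploy() -> bool:
    return True


stages = [("build", build), ("test", run_tests), ("deploy", deploy)]
print(run_pipeline(stages))  # → ['build', 'test', 'deploy']
```

If the test stage returned False instead, the function would return only `['build']`, and the deploy stage would never run.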
Moreover, since programmers can spend less time manually integrating code, deploying changes, and doing other routine tasks, they can focus more on the customers’ requests to update some functionality or create new features. Because DevOps teams work in short cycles, your users will not have to wait long for some big release to get an upgraded version of an application.
To summarize, with DevOps, you will be able to improve communication and collaboration between your development and operations teams to increase the speed and quality of software development and deployment.
With the basic knowledge of the DevOps concept and its benefits, let’s move on to MLOps. Here things are a bit different. Today, ML model development and operations are often entirely separate, and the deployment process is often manual. Therefore, building and maintaining a solution might take longer than expected. Machine Learning Operations (MLOps) is a set of techniques used to optimize the entire ML lifecycle. Its aim is to bridge the gap between design, model development, and operations.
MLOps focuses on combining all the stages of the ML lifecycle into a single process workflow. Such a goal requires collaboration and communication between many departments in a company. However, if you manage to achieve it, MLOps provides a common understanding of how ML solutions are developed and maintained to all stakeholders. It is similar to what DevOps does for software.
The key MLOps principles are:
The core practices of MLOps are continuous integration (CI), continuous delivery (CD), continuous training (CT), and continuous monitoring (CM). Let’s cover each of them:
So, the MLOps loop is pretty similar to the DevOps one with slight adjustments that are ML-specific.
Following the MLOps philosophy when developing an ML solution has many benefits. To give a short overview, they include:
To sum up, with MLOps, you can deploy an ML training pipeline that automates the retraining and deployment of new models, which is much better than deploying a single model available via an API endpoint.
As you might have noticed, there are a lot of similarities between MLOps and DevOps concepts. It should not be a surprise because MLOps borrows a lot of principles developed for DevOps.
Both DevOps and MLOps concepts encourage and facilitate collaboration between the development teams, for example, programmers, ML engineers, employees who manage the IT infrastructure, and other stakeholders. Also, both aim to automate the continuous development processes to maximize the speed and efficiency of your engineering team.
However, although DevOps and MLOps share similar principles, you cannot simply take DevOps tools and use them as-is on an ML project. Unfortunately, the devil is in the details, so MLOps has some ML-specific requirements. Let’s check them out.
The first thing you should keep in mind is the difference in versioning between these two concepts. In DevOps, it is pretty straightforward as you use versioning to provide clear documentation of any changes or adjustments made to the software under development. So, 99.9% of the time, it is only about the code. That is why in DevOps, we usually refer to versioning as code versioning.
However, when working on a Machine Learning project, code is not the only thing that might change. In addition to the code, MLOps aims to keep an eye on the versions of the data, hyperparameters, logs, and the ML model itself.
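As an illustration, here is a minimal sketch of what versioning more than code could look like: each artifact gets its own content hash, so you can tell which part of an experiment changed. The function names and toy byte strings are hypothetical, and logs are left out for brevity. Dedicated data and model versioning tools do this far more thoroughly.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Short content hash used as a version identifier."""
    return hashlib.sha256(payload).hexdigest()[:12]


def experiment_version(code: bytes, data: bytes, hyperparams: dict, model: bytes) -> dict:
    """Version every artifact of an ML experiment, not only the code."""
    # sort_keys makes the hash independent of the order the keys were written in
    hp_bytes = json.dumps(hyperparams, sort_keys=True).encode("utf-8")
    return {
        "code": fingerprint(code),
        "data": fingerprint(data),
        "hyperparams": fingerprint(hp_bytes),
        "model": fingerprint(model),
    }


# Toy artifacts (assumptions for illustration only).
v1 = experiment_version(
    b"def train(): ...",
    b"img1,open\nimg2,closed\n",
    {"lr": 0.01, "epochs": 10},
    b"weights-v1",
)
# Same code and hyperparameters, but one more labeled image and a retrained model.
v2 = experiment_version(
    b"def train(): ...",
    b"img1,open\nimg2,closed\nimg3,open\n",
    {"epochs": 10, "lr": 0.01},
    b"weights-v2",
)

changed = [name for name in v1 if v1[name] != v2[name]]
print(changed)  # → ['data', 'model']
```

Here the code did not change at all, so code versioning alone would have missed that the second run used different data and produced a different model.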
Second, if you have ever worked on an ML project, you might know that training an ML model requires a lot of computational resources. For most software projects, the solution’s build time is rarely a concern, and therefore, the hardware does not play a significant part. Unfortunately, in ML, the opposite is true. It might take plenty of time to train larger ML models, even if you use large GPU clusters. Thus, there are more stringent hardware testing and monitoring requirements in MLOps.
Last but not least, DevOps and MLOps differ in their monitoring approaches. In software development, the characteristics of your solution might not need any changes over time, whereas in Machine Learning, models must change to stay competitive. Once you deploy a model into production, it starts working on the data it receives from the real world. Real-life data keeps changing as the business environment changes, so the quality of the model tends to decrease over time. MLOps provides automated procedures that facilitate continuous monitoring, model retraining, and deployment to minimize this problem. Thus, the model can stay up to date and keep its performance at roughly the same level.
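A minimal sketch of the monitoring part could look like the following. It tracks accuracy over a rolling window of recent predictions and raises a flag when accuracy drops below a threshold. The class name, window size, and threshold are assumptions for illustration. The sketch also assumes that ground-truth labels eventually become available for production predictions, which is not always the case.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions and flag drops."""

    def __init__(self, window: int, threshold: float):
        self.outcomes = deque(maxlen=window)  # True for correct, False for wrong
        self.threshold = threshold

    def record(self, prediction, label) -> None:
        self.outcomes.append(prediction == label)

    def accuracy(self) -> float:
        if not self.outcomes:
            return 1.0  # no evidence of a problem yet
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self) -> bool:
        # Only judge the model once the window is full, to avoid noisy alarms.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.threshold


monitor = AccuracyMonitor(window=5, threshold=0.8)

# The model starts out well: five correct predictions.
for _ in range(5):
    monitor.record("open", "open")
print(monitor.needs_retraining())  # → False

# Later, two wrong predictions push the rolling accuracy down to 0.6.
monitor.record("open", "closed")
monitor.record("closed", "open")
print(monitor.needs_retraining())  # → True
```

In a real setup, the flag would trigger a retraining job rather than a print statement.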
It is no secret that there are many obstacles you might face when developing, deploying, or operating a Machine Learning solution. It is always better to know these obstacles in advance so that you can prepare a solution. So, let’s identify them:
However, with MLOps, you will address these issues and get some additional benefits:
Unfortunately, even if your ML model is in production and served well, you are far from done. As mentioned above, an ML model might underperform in the production environment because of model decay. Strictly speaking, the term model decay refers to the phenomenon of an ML model’s accuracy decreasing over time.
Unfortunately, there is not much you can do to prevent the decay ahead of time. If it occurs, it occurs. It happens because your model is not operating in a vacuum but in the ever-changing production landscape. The world is in a constant state of change, and the data follows that. Moreover, the data your model was initially trained on is also likely to differ from the real-life data. The chances that you covered all edge cases when initially training your model are low. Thus, you can expect a mismatch between the data the model saw during training and what it sees in production, and you might face model decay as a result.
You might come up with an idea to adapt your model over time to avoid this problem. However, it is tough to fix model decay on the model side as the lack of performance is not the model’s direct fault. It happens because of changes in the data.
Let’s take a look at a simple example. Imagine working on a vision AI project that detects whether a person’s eyes are open or closed. As a training set, you use plenty of images of human eyes, but none of people wearing glasses. So, when you deploy your model, you will find that it underperforms on images with glasses. The glasses case is an example of data shift, and the resulting drop in performance is model decay.
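One common way to spot this kind of shift is to compare the distribution of a feature in the training data with its distribution in production. The sketch below uses the Population Stability Index (PSI), which is a technique chosen here for illustration and not something this article prescribes. The scalar feature (think of a per-image brightness score), the bin edges, and the 0.2 alert level are all assumptions. The 0.2 level is only a widely used rule of thumb.

```python
import math
from typing import List, Sequence


def histogram(values: Sequence[float], edges: List[float]) -> List[float]:
    """Share of values per bin. Values outside the edges are ignored."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last_bin = i == len(counts) - 1
            # Bins are half-open, except the last one, which includes its upper edge.
            if edges[i] <= v < edges[i + 1] or (is_last_bin and v == edges[-1]):
                counts[i] += 1
                break
    eps = 1e-6  # floor for empty bins so the logarithm below stays defined
    return [max(c / len(values), eps) for c in counts]


def psi(train: Sequence[float], prod: Sequence[float], edges: List[float]) -> float:
    """Population Stability Index between training and production values."""
    p = histogram(train, edges)
    q = histogram(prod, edges)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


edges = [0.0, 0.25, 0.5, 0.75, 1.0]
train_feature = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
prod_feature = [0.7, 0.8, 0.85, 0.9, 0.9, 0.95, 0.95, 1.0]  # shifted upward

drift_detected = psi(train_feature, prod_feature, edges) > 0.2
print(drift_detected)  # → True
```

For image data, you would compute such a score on features derived from the images, since raw pixels are rarely useful for this kind of comparison.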
However, if you have not noticed the model decay and have already moved on to another project, it might take a while before you see that your previous model underperforms. To avoid this, you need a solution that automatically detects the decay and addresses it. That is where MLOps can back you up. With a well-rounded MLOps tool, you will be able to set up a CT/CI/CD pipeline that automatically detects the decay, retrains the model, and replaces the model in production with the updated version.
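The detect, retrain, and replace cycle described above can be sketched in a few lines. This is a toy version under stated assumptions: the models are simple threshold classifiers on a single number, and `continuous_training_step`, `train_threshold_model`, and the data are hypothetical names and values made up for illustration. A real pipeline would train actual models and evaluate them on much larger holdout sets.

```python
from typing import Callable, Sequence, Tuple

Model = Callable[[float], int]
Example = Tuple[float, int]  # (feature value, label)


def accuracy(model: Model, data: Sequence[Example]) -> float:
    return sum(model(x) == y for x, y in data) / len(data)


def train_threshold_model(data: Sequence[Example]) -> Model:
    """Toy training: put the decision threshold halfway between the class means."""
    zeros = [x for x, y in data if y == 0]
    ones = [x for x, y in data if y == 1]
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: int(x > threshold)


def continuous_training_step(current, train_fn, fresh_data, holdout, min_accuracy):
    """Return (model to serve, whether the production model was replaced)."""
    current_accuracy = accuracy(current, holdout)
    if current_accuracy >= min_accuracy:
        return current, False  # no decay detected, keep serving the current model
    candidate = train_fn(fresh_data)  # retrain on recent production data
    if accuracy(candidate, holdout) > current_accuracy:
        return candidate, True  # promote only if the candidate is actually better
    return current, False


# The old model was trained when the classes were separated around 0.5.
old_model = lambda x: int(x > 0.5)

# Recent data has shifted: class 0 now sits around 0.65, class 1 around 0.95.
fresh_data = [(0.62, 0), (0.68, 0), (0.92, 1), (0.98, 1)]
holdout = [(0.6, 0), (0.7, 0), (0.9, 1), (1.0, 1)]

model, replaced = continuous_training_step(
    old_model, train_threshold_model, fresh_data, holdout, min_accuracy=0.9
)
print(replaced, accuracy(model, holdout))  # → True 1.0
```

Comparing the candidate against the current model before promoting it is a deliberate choice: retraining on fresh data does not guarantee a better model, so the swap should be gated on an evaluation.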