How do you best structure ML teams? In this article, we break down two approaches, go through the pros and cons of each, and recommend an approach depending on company type.
With nearly limitless quantities of data, the growth of inexpensive storage, and powerful processing, ML technology is being adopted rapidly. Many industries have opted to build new ML solutions into their businesses. But, as many are finding out, developing ML solutions is a difficult endeavor. Companies new to the field face many pitfalls and problems on their journey to getting AI models to production. This is where the organization of ML teams becomes crucial. With the right organization, you’ve laid most of the groundwork for any and all ML projects.
How people are organized varies from company to company. It depends on how significant ML is to the product, the size of the company, and many other factors. In this post, we will go through two different approaches to building teams and share our experience of how we see those approaches working. But why should you listen to us?
Besides being an ML company ourselves, we have interacted with and helped over one hundred teams looking to develop AI solutions as part of our everyday work. We’ve seen quite a few different structures and have formed our own opinions on best practices with that in mind.
A centralized ML team consists mainly of machine learning engineers and data scientists. They work as a separate unit, apart from traditional software engineering, product, or any other department.
This independent group of ML engineers and data scientists means that the talent density within the ML team is very high. Thus, the company can employ the best ML practitioners for any relevant task, no matter where in the organization that task originates.
With an independent ML team, the company can kick off any new Machine Learning ideas extremely fast. A centralized ML team also maximizes knowledge sharing across the function, thus allowing a depth of knowledge to be developed, standards to be set, and a shared tech stack to be created.
Indeed, an excellent centralized team can create processes and technology that can be shared with other groups. This hub approach means that over time, an organization can spread ML know-how to other teams and enable different parts of the organization to conduct their own AI experiments.
Having this hub of knowledge also makes it easier to hire for new openings, as teams organized in this manner are very attractive to potential hires.
Furthermore, it creates a clear and dedicated space for any and all ML initiatives. It creates a natural communication pathway for other people who want to build AI solutions. With the right internal promotion, most centralized teams are very in-demand internally.
If you decide to go down this road, it is essential to ensure that the ML team has a seat at the table at the leadership level. Centralized ML teams are often hard to fit in under existing leadership structures. If AI efforts are essential to the company, you want someone representing those efforts high up in the organization.
Although centralized teams are an excellent way to organize your ML workforce, this approach has some potential drawbacks.
First and foremost, it can become a silo of knowledge. If the team is not integrated correctly with the rest of the organization, chances are that knowledge sharing from the team to the rest of the organization simply doesn’t happen. Therefore, it’s important to ensure processes are in place for knowledge sharing and educating other teams.
Secondly, as the team often (but not always) doesn’t have its own software development resources, it needs to work well with other teams. This shared responsibility approach can work well, but be careful to ensure that all involved teams align around the priority of work tasks and the goals of any projects.
Thirdly, a well-known centralized ML team often faces the problem of having too many use cases to pick from. New ideas will flow to the team from throughout the organization. So a centralized team needs reasonable and transparent processes for prioritizing tasks, and it needs to master stakeholder management.
Here, it is also essential to be clear with other decision makers about what the internal priorities are. For example, it’s perfectly valid to say that you need time to build up MLOps infrastructure. This will help you develop AI projects faster down the line as you build out the team. However, without agreed-on expectations, other groups and higher-ups might get frustrated by the perceived lack of progress.
Another issue that can pop up is a lack of business understanding. For example, you can have an AI team that is very competent technically. But if the team doesn’t understand what the core business of the company is, what the needs of its customers are, and the overall strategy of the business (to name a few examples), there’s a risk that they will spend their time developing something that doesn’t hit the mark. Here, it’s vital to ensure that other teams share their knowledge with the ML team.
Then, you have the issue of AI being very hard to estimate. In large organizations especially, when many different teams work together, there’s often a clear project plan with set deadlines and budgets. This can be difficult for an AI team to deal with as many AI development cases are trial-and-error. It’s tough to answer questions about “How much data is needed” or “How long to develop an AI model that produces X result” without first experimenting. At the core of this problem is the black-box nature of AI. You won’t know what will work until you try. A good recommendation is to give your centralized team their own budget and ensure that any project planning takes experimentation into account.
Finally, the team needs to have a clear function beyond “developing AI.” What is the core function of the team? Is it R&D of future technologies, or is it to improve business metrics in the coming months by extending the product range with AI capabilities?
If their function is unclear, it can be challenging for centralized teams to contribute to what the organization expects, leading to unhappy stakeholders and employees.
In a decentralized setup, ML practitioners are part of an entire “feature” team made up of people from product, marketing, software engineering, design, and machine learning. The goal of this team is to develop a specific feature or product.
Decentralized teams are more product-focused and have a high concentration of product knowledge. A cross-functional team means you have all the knowledge and expertise to deliver the right thing at great speed. This also allows for easy experimentation with different machine learning ideas.
With these more diverse teams, you also create more independence. They don’t need to rely on other groups to build a complete product. This can be very helpful for an organization, as you don’t need different teams to align on their goals and priorities, and it generally leads to faster development times.
For instance, when LinkedIn ML engineers couldn’t try their “recommendation engine” because they couldn’t get a front-end, they switched to a decentralized squad to test the “People You May Know” (PYMK) feature. The group included design, web, product marketing, and engineering, and the PYMK feature became one of the most successful products a LinkedIn team ever created.
Generally (but not always), these teams have a more precise focus than centralized teams. Often, they have a clear reason for existing – creating new products and services. This makes it easier for the team to prioritize work and prevents losing focus.
As these teams are made up of people from different parts of the organization, they also tend to understand the organization at large better. For example, if you bring in a backend engineer who has already worked at the company for a couple of years, they will be familiar with the technology the company has built elsewhere and will better understand what you need (and don’t need) to integrate with it.
Finally, setting up these teams can also lead to better knowledge sharing. Your ML engineers and data scientists might help the rest of the team grasp and get into machine learning. For example, a DevOps engineer can transition to being an MLOps engineer. But, it also works the other way around, with the larger team sharing knowledge on anything from how the organization works to development processes. In short, good decentralized teams often have symbiotic relationships internally that benefit all parties.
The main advantages of decentralized teams are speed and independence. This makes these teams very good when doing practical work and delivering value to the organization. However, this team structure can be challenging to use if you are looking for more of an AI R&D setup.
Another potential issue to look out for is that the team makes decisions that don't align with the organization at large. For example, a decentralized team might adopt a new CI/CD service while the rest of the organization already uses a different solution. This can make future hand-overs more complex and can lead to duplicated spending.
A third potential issue is that these teams sometimes create a silo between themselves and other development projects. As they are highly independent, they can lack insight into what's happening elsewhere, which means good processes need to be in place to make sure the team is updated on others' progress and shares its own with relevant stakeholders. This is especially important in terms of overall organizational strategy.
A fourth potential issue is ensuring knowledge sharing on the machine learning side across the organization. As you spread out your ML expertise, your ML personnel will be focused on solving the assigned problem. However, even if you have decentralized teams, it makes sense to have your ML engineers and data scientists across the organization meet regularly to align and discuss solutions to common problems shared across groups. A typical example of this is the development of MLOps practices that can be used in all teams to speed up the development of AI.
You might also have an issue when it comes to recruiting talent, especially if you are looking to make one ML hire per team. These positions, where a single person is responsible for everything ML, can be seen as less attractive than working in a centralized ML team. However, if you have processes to solve the issues outlined above, you should be able to make a convincing argument for why that is not a problem for the candidate.
Finally, you can have a problem with where new AI initiatives go in the organization. As you have decentralized, function-oriented teams, it can be more challenging for the organization to know where new ideas should be directed and where teams without AI experience should get help.
It probably doesn't make sense for smaller startups to have a centralized team as you have a limited headcount and your focus lies in developing your product. Often, the focus lies on delivering today (preferably yesterday), so fast development and decision-making processes are of the highest importance. Therefore, a decentralized approach is preferable for most.
However, one exception here could be that you are pushing the state-of-the-art in AI. If you are one of these R&D-heavy startups, it can make sense to concentrate your machine learning expertise in one team as you often are doing exploratory work that stands to benefit the company in the coming years, not weeks.
One data point here is our structure at Hasty. We have temporary decentralized teams assigned to build a specific functionality in one or a couple of sprints, after which people get reassigned to new teams. But we also have weekly meetings with all ML personnel to share what they are working on and debate how to build better supporting ML infrastructure.
The recommended approach for larger startups depends on what we wrote above concerning R&D vs. practical implementation. If you want to mainly implement AI into your product, decentralized teams should make the most sense. If you're going to spend significant time and resources exploring new approaches to AI and then bringing them into your product at a later stage, a centralized team might make the most sense.
For enterprises, it might make sense to think about using both approaches outlined above. Of course, this also depends on what you prioritize. Do you focus on implementation or R&D?
However, from what we've seen, the answer is usually both. In that case, it could make sense to have a core centralized AI team that works on creating infrastructure, developing best practices, and evaluating state-of-the-art approaches. You can then add decentralized units to build applications and implement AI across different business functions. This approach can be very beneficial as you get the best of both systems, but it demands a clear division of responsibilities between the teams to avoid friction.
There's also a third option we've seen sometimes: creating a centralized team with development and product capabilities. Essentially, these teams work almost as a separate startup within the organization and can develop AI capabilities across your product and service range. This can ensure better prioritization as the team can pick which projects will benefit the company the most, not only their particular section. Of course, you need to integrate these teams exceptionally well with the rest of the organization to deliver value.
Whichever team structure you opt for, the only thing that counts in the end is turning ML into business value. Neither model is perfect, and you should be aware of the weaknesses of each.
For example, Airbnb realized that the central approach lacks product expertise, so they created an education format for their data scientists to foster their product engineering skills.
As a result, product knowledge and a sense of product ownership increased among data scientists. It also improved the end-to-end customer experience and alignment with product design principles and practices.
In the end, it will come down to figuring out what works best for you and fixing issues as you go.
However, we hope this article has been of help when it comes to outlining the benefits and potential pitfalls of both approaches.