17.01.2022 — Vladimir

Buy or build ML solutions

This article will look at one of the most complex decisions for most organizations starting new AI projects. Should they buy or build the software they’ll use to develop their AI models? We’ll present arguments for both sides and share some of our experiences from discussions we had in the past.

As this article is very long, let’s start with the summary.



Our take (of course with an obvious bias) is that it doesn’t make sense for most organizations to build software across the AI lifecycle. What’s available on the market is getting so good that the chances of you building something better are low. After all, we are dealing with a space that has had billions of euros and dollars invested in it in the last five years.

Of course, there are exceptions to this. As we outline below, commercial software has its disadvantages. But, considering the downsides of building something yourself, we strongly recommend that you evaluate the market before making something yourself.

Build AI software if…

Buy AI software if…

With the summary done, let’s explain how we got to these conclusions:

A background on how the AI software market has changed

The push to get AI into production environments started around a decade ago. At that point, early adopters began to look at using AI for very particular purposes - mainly autonomous driving. Back then, there wasn’t a lot of AI software on the market. If you wanted to work with AI, you had to build the tooling yourself.

This was the case for most of the last decade. However, at the end of the decade, a lot of software and tooling became available on the open market. First, we saw many open source projects created to solve particular problems in the AI development process. Then, we also started seeing commercial players like ourselves entering the space.

Today, there are many different options available for you to use if you want to buy software. However, the core question remains - should you purchase something off-the-shelf or build it yourself?

Understanding your needs

To answer that question, first, you need to understand your needs and how they will evolve. A good exercise can be to answer the following questions:

1. How core is AI development to our organization?

This is an essential question for organizations, as it is probably the most important factor in deciding whether to build or buy a solution. Essentially, you need to figure out if AI is at the heart of your organization’s future. To give some examples:

Understanding this is important as it is one of the main factors deciding if you build or buy.

2. What approach do we want to take to AI?

There are many different approaches to AI. In vision AI, for example, you can do anything from image classification to instance segmentation.

Deciding on your approach to AI is one of the most important decisions you will make, as it will lock you into that approach for the foreseeable future.

For more information on picking the right approach, you can read a previous article we did on the subject here.

3. What are our requirements for the software?

Different organizations have different needs. Mapping out what you need in terms of both core functionality (e.g., we should be able to train an ML model and compare different models with each other) and supplementary functionality (e.g., it would help us if we could delegate work in the software itself) will give you a shopping list of needs that you can then use to source software.

Internally, when we source software, we use the MoSCoW method to map out what features we must have, what we should have, and what would be a bonus but not crucial.
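To make the shopping list concrete, here is a minimal sketch of how such a prioritized list could be used to screen candidates. The requirement names, the vendors, and the numeric weights are all hypothetical and only illustrate the idea: a candidate that misses any must-have is ruled out, and the rest are compared by a simple weighted count.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    # Weights are an arbitrary assumption; pick whatever suits your team.
    MUST = 3
    SHOULD = 2
    COULD = 1


@dataclass(frozen=True)
class Requirement:
    name: str
    priority: Priority


def evaluate(requirements, supported):
    """Return (passes_must_haves, score) for one candidate.

    `supported` is the set of requirement names the candidate covers.
    The score adds up the weight of every requirement it supports.
    """
    missing_musts = [
        r.name
        for r in requirements
        if r.priority is Priority.MUST and r.name not in supported
    ]
    score = sum(int(r.priority) for r in requirements if r.name in supported)
    return (not missing_musts, score)


# Hypothetical shopping list based on the examples in the text.
requirements = [
    Requirement("train a model", Priority.MUST),
    Requirement("compare models", Priority.MUST),
    Requirement("delegate work in-app", Priority.SHOULD),
    Requirement("link to internal docs", Priority.COULD),
]

# Hypothetical candidates and the features they cover.
vendor_a = {"train a model", "compare models", "link to internal docs"}
vendor_b = {"train a model", "delegate work in-app", "link to internal docs"}

result_a = evaluate(requirements, vendor_a)
result_b = evaluate(requirements, vendor_b)
```

In this made-up example, the second candidate covers more of the nice-to-haves but is still ruled out because it misses a must-have. That is the main point of separating the tiers before you start comparing options.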

4. What is our budget?

Another crucial requirement that you need to understand is your budget. Do you have millions to spend or a couple of thousand? Depending on your answer, you will know if it’s even feasible to build and get an upper limit for how much you can spend if you decide to buy.

How answering these questions helps you make a decision

With these needs and requirements mapped out, you now have the initial data you need to decide on the first steps on how to proceed.

From here, generally, our advice would always be to first look at what’s available on the market. The reason for that is two-fold. First, there’s so much available AI software today that chances are someone will provide what you need. Secondly, the cost of building AI software can be pretty high.

Looking at ourselves and our competitors as examples, the cost of developing commercial-grade AI software ranges from five to hundreds of millions. So any decision to build will likely be a massive investment. We’ve seen that building a complete AI software solution is out of scope for everyone but the most prominent players.

There’s still room to build software for specific parts of your AI pipeline. Here, understanding how core AI is to your business is essential. For example, an AI consultancy might say that their core competency is developing state-of-the-art models for their clients. For them, it might make sense to build software for that specific part of the AI development process, as it could give them a competitive advantage, and to buy solutions for other tasks such as data annotation.

Another example is a medical company where data privacy and security are critical. Although they might decide to buy most tooling, building their data lake and integrating their chosen software with their data lake could make sense. That way, they control what’s essential for them.

With all that said, let’s look at some advantages and disadvantages of the different choices.

The advantages of building it yourself


Control

It almost goes without saying, but building it yourself gives you complete control. You can create what you and your organization want without compromises, and any new needs will automatically be the top priority for development.

Particular needs on core functionality

Closely linked to control are particular needs. Today, there is software for all of the common approaches to AI. But you might be in a field with specific requirements that commercially available tooling doesn’t solve. For example, the audio ML space is underserved in terms of available software, so you might have to build it yourself if you have a complex use case in that space.

Building the correct supplementary functionality

Another common reason to build your own software is that you need particular supplementary functionality. For example, let’s say you have very exact needs for monitoring a model on the edge, where you need to use state-of-the-art encryption for any communication between your software and your model. The chances are that most ML monitoring software doesn’t yet support what you need in terms of encryption, so you would have to build it yourself.

Complex integration requirements

Another common reason (and advantage) for developing software in-house is that you have precise requirements in terms of integrations. Maybe you’ve spent a lot of time and money on building a data lake with complex data (let’s say 3D video data, for the sake of argument). You also have vast amounts of essential metadata structured in complex ontologies. Now you want any software you use to integrate with that data lake. Here, you might have trouble finding a commercial solution, as you are working with a data type that’s relatively new to the field and you require the solution to handle complex ontologies. The chances of finding both in one commercially available product are low.

Creating a competitive advantage

There’s also an argument to be made that you can create a competitive advantage by building the right solution in-house. The focus here lies on it being the right solution as you don’t want to spend millions of dollars or euros making something with a similar feature set that competitors can just buy off the shelf.

To give an example: If you decided to build your own annotation tool in 2014, that would have made a lot of sense. At the time, the tooling available on the market wasn’t very mature, and the competitive advantage you could gain over others without a massive investment was considerable.

Today that is no longer the case for most standard annotation methods. So sinking a couple of million into creating an annotation tool - with further maintenance costs to come - is probably not the smartest choice.

However, there are always new frontiers in AI where innovative or very specialized companies can gain an advantage by developing in-house.

Data governance

Another factor in building things yourself is data governance. Essentially, here we are talking about who owns the data. For many organizations, data privacy and security are essential, and they are becoming increasingly important with regulations like GDPR coming into force around the world. However, the commercial market is adapting to these regulations too. Hasty, as an example, is a European company. As we are directly affected by GDPR, we have done a lot of work making sure we are compliant. We also offer users solutions for keeping data in their environments while still using our solution. We are not alone in this. Most commercial companies understand that this is a growing need, so it’s not as much of a reason to build as it used to be.

Disadvantages of building it yourself


Time-to-value

For most organizations, the biggest issue with building something yourself is time-to-value. Essentially, if you decide to build, you can expect at least a couple of months of initial development time, followed by a month or two of testing and bug fixing before you have something usable. If what you require can be bought, you will have spent months building something you could have purchased and started using in a day.

Opportunity cost

Another disadvantage is the opportunity cost - i.e., what could the team building your new solution have done instead?

For most organizations we talk to, the goal is to build the best possible AI model to solve a specific problem. You do this by creating great data and spending a lot of time and effort training and benchmarking different models. Adding software development to that equation means less time and budget for creating that model.

Here, you must know which parts of the AI lifecycle are at the absolute core of your business. If what you are building isn’t part of that core, chances are you are wasting resources that would give you more of a return elsewhere.

Financial cost

We’ve already talked about it, but building software yourself can be very expensive. Before you start building, you had better be sure that you need to do so. The cost of building solutions with the same quality and feature set as those you can find on the market is high and runs into millions of euros or dollars.


Maintenance

Maintenance is often overlooked and underestimated, but when building software, you also have to consider the upkeep that goes with it. From the discussions we have had with teams, you usually start with a smaller scope for what your software is supposed to do. Over time, as your needs change, that scope starts creeping, new features are required, and you have to develop a new version of your software. This, combined with everyday maintenance and bug fixing, quickly builds up and leads to high costs in both time and money.

Maintenance is also a considerable bottleneck if what you build is highly integrated into other parts of your software stack. If that’s the case, you have to maintain both your software and its integrations with other software. If something changes elsewhere, you have to react to that.
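To see how development and maintenance costs add up against a license fee, a rough back-of-the-envelope comparison can help. The sketch below is only an illustration: the function names and every figure in it are made-up assumptions, not numbers from real projects, and a proper comparison would also account for the opportunity cost discussed above.

```python
def build_cost(initial_dev, annual_maintenance, years):
    """Total cost of building in-house over a number of years."""
    return initial_dev + annual_maintenance * years


def buy_cost(annual_license, integration_work, years):
    """Total cost of buying over the same period."""
    return integration_work + annual_license * years


def break_even_years(initial_dev, annual_maintenance,
                     annual_license, integration_work):
    """Years after which building becomes cheaper than buying.

    Returns None when maintenance costs at least as much per year as
    the license, because building then never catches up.
    """
    yearly_saving = annual_license - annual_maintenance
    if yearly_saving <= 0:
        return None
    return max(0.0, (initial_dev - integration_work) / yearly_saving)


# Hypothetical figures in euros, for illustration only.
build_total = build_cost(initial_dev=2_000_000,
                         annual_maintenance=300_000, years=3)
buy_total = buy_cost(annual_license=50_000,
                     integration_work=20_000, years=3)
never_pays_off = break_even_years(2_000_000, 300_000, 50_000, 20_000)
small_tool = break_even_years(100_000, 10_000, 60_000, 0)
```

With these invented numbers, the in-house option never breaks even because yearly maintenance alone exceeds the license fee. The second call shows the opposite case, a small tool that pays for itself after two years.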


Risk

Another important aspect here is that you take on all of the risk. The project could go over budget or over time. Any data leaks are your sole responsibility. As you are likely building your solution from scratch, it’s not a question of whether something goes wrong (from my own experience, it does) but of how critical a problem it becomes.

The advantages of buying AI software


Speed

Of course, the main reason for buying anything is speed. You pay, and then you can start working. Instead of spending months building software yourself, you can get started right away. As most teams we talk to have goals they need to achieve in the near future, they often cite this as the main reason for buying solutions.


Cost

The second most crucial factor for teams choosing to buy solutions is cost. Compared with the budget needed to build something similar, most software pricing looks very fair. This makes sense, as you share development costs with all of the provider’s other customers. So instead of spending millions, you are spending thousands (or even hundreds) of euros or dollars.

There is an asterisk here. Pricing for AI software varies greatly depending on the provider. For similar solutions, you can see a difference in the price of 10x or even more, depending on the provider. AI software is a new field, and there’s not yet any standardization in pricing. More prominent, better-known providers can ask for higher prices while newer competitors often ask for less.

Another aspect to take into account is pricing transparency. Some solutions require you to talk to sales before giving you a cost proposal. This makes it tricky for customers to choose software until they’ve gotten a couple of proposals in. If you can excuse us for a second, this is something we are against here at Hasty. We think transparency in pricing is essential for potential customers to understand the cost when evaluating us.


Functionality

A commercial software provider will also have built a lot of functionality over years of operating. Although it does not always seem relevant at first, there’s a reason they built that functionality in the first place. Using Hasty as an example, we find that teams that are new to the space and starting to use our product often focus on annotation and model building. Over time, for many users, it becomes clear that they need better quality control of their data. Luckily for them, that’s something we already built.

Another example is changing AI approaches. If you decide mid-stream to switch from object detection to instance segmentation, for example, you can do so without having to rebuild your own software from scratch.

In general, this means that the software you buy can almost always grow with you and that you don’t have to wait for your internal IT team to ship features.


Focus

The most significant positive effect of buying tooling instead of building it is the focus you gain. Instead of splitting your AI team between building models and building software, the team can concentrate wholeheartedly on creating the best possible model.


Integrations

Similar to functionality, many commercial software providers have already built standard integrations. This means less work for you when you integrate the solution with your AI pipeline.

No maintenance

Of course, another advantage of buying a solution is that you have maintenance included in the price. Besides giving you peace of mind, this also gives you greater cost certainty for the lifetime of your project(s), as you don’t have to worry about maintaining software.


Experience

Another overlooked advantage of buying a solution is that the provider often has experience that is valuable for your project. Using Hasty as an example, we often find ourselves helping new teams navigate the pitfalls and roadblocks that most companies experience when starting new AI projects. In that way, a good provider will not only provide you with a great product but also act as a sounding board for you.

Disadvantages of buying AI software


Control

Of course, if you buy something, you can’t control it. That means you are dependent on another organization and their customer support for any needs you have. Here, it can be a good idea to check the customer service record of any provider you are evaluating. Are user reviews positive? Do they have any guarantees regarding time to answer or time for fixing any issues?

If you are giving up control, it will be vital to understand who you are giving up control to and ensure that they are reliable.


Integrations

Hey, we already had this as an advantage! Surely this must be a typo. Unfortunately not. Although most providers have built integrations that plug into user environments, there is a lack of existing integrations between different providers today. For you, that means that if you are planning on using more than one solution, you might have to build integrations between them yourself.

Lack of edge case feature coverage

Commercial tool providers develop with a market in mind. For us, it makes sense to go for the largest market and build features for the most common use cases. This works well for 90% of cases, but it can be tricky to find a provider if you are in a niche field or working on rare use cases.


Foresight

Another common problem for customers of software providers is a lack of insight into what will come next or when you can expect that particular feature you need. Usually, companies take one of two approaches to this. The first approach is to share publicly what they are working on so that you as a user know what will be coming next. The second approach is more personal, with providers having a dialogue with you and keeping you up-to-date.

However, you will never have the same foresight into what’s going on as if you built something yourself.


Customization

Finally, you have the problem of customization. When using commercially available software, you often find that the software would be more beneficial to you with some tweaks. It can be something minor, like being able to link directly from the software to your internal documentation, or something more extensive like changing the logic of the interface in a way that better fits your workflows.

Although commercial providers often listen to your customization needs, from their perspective, what you propose needs to be relevant for the rest of their user base if they are going to build it. Therefore, your ability to customize commercial software is limited.

Shameless plug time

Only 13% of vision AI projects make it to production. With Hasty, we boost that number to 100%.
Our comprehensive vision AI platform is the only one you need to go from raw data to a production-ready model. We can help you with:

All the data and models you create always belong to you and can be exported and used outside of Hasty at any given time entirely for free.

You can try Hasty by signing up for free here. If you are looking for additional services like help with ML engineering, we also offer that. Check out our service offerings here to learn more about how we can help.

Appendix A: A single end-to-end ML software solution vs. multiple expert ones

If you have chosen to buy an ML solution, you will face another dilemma. As you might know, there are end-to-end ML solutions that support you through the whole ML lifecycle and smaller tools covering different aspects of the process. Let’s say that the first option can be categorized as platforms that aim to handle all your needs, and the second option can be called expert tools that fulfill one function you need. In the context of an AI project, this leaves you with a choice. Do you use an end-to-end solution or use multiple specialist tools?

On the one hand, using an end-to-end solution is an easy path. If you make the right choice, the tool will have all the capabilities you need to work on your Machine Learning project, and you will have no friction between different stages of development. However, an end-to-end solution covers all the stages of the ML lifecycle without specializing in any of them. Therefore, there might be tools with greater functionality for a particular stage.

On the other hand, expert tools are specialized and offer advanced capabilities in specific fields. However, using multiple specialist tools means you will have to build integrations between them - today, there’s very little in terms of pre-built integrations for you to use. Unfortunately, this process is far from obvious and time-consuming, since you need to figure out whether the tools you have chosen even work together and how to make them work.

If you succeed in that, you will also spend some time setting up your working pipeline and solving the source-of-truth problem (you may need additional software at this point). Still, your working pipeline will remain fragile, since any minor change in the tools’ APIs can strongly affect it and force you to adapt the pipeline. Thus, using many expert tools comes with maintenance costs.
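One way to limit the damage from such changes is to validate data at the boundary between two tools, so that an upstream change shows up as a clear error instead of silently corrupting your training data. The sketch below assumes a hypothetical annotation export with `image_id`, `label`, and `bbox` fields and a hypothetical training format. Neither is taken from any real product.

```python
# Expected shape of one record exported by the (hypothetical) annotation tool.
REQUIRED_FIELDS = {"image_id": str, "label": str, "bbox": list}


def validate_annotation(record):
    """Raise ValueError if a record does not match the expected shape."""
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        if not isinstance(record[field], expected_type):
            raise ValueError(f"{field} should be {expected_type.__name__}")
    if len(record["bbox"]) != 4:
        raise ValueError("bbox should have four values: x, y, width, height")
    return record


def to_training_format(record):
    """Convert a validated export record to the (hypothetical) training format.

    The export stores boxes as x, y, width, height; the training side is
    assumed to expect corner coordinates instead.
    """
    checked = validate_annotation(record)
    x, y, width, height = checked["bbox"]
    return {
        "id": checked["image_id"],
        "class": checked["label"],
        "box": (x, y, x + width, y + height),
    }


converted = to_training_format(
    {"image_id": "img-1", "label": "car", "bbox": [10, 20, 30, 40]}
)
```

If the upstream tool renames a field in a new release, the adapter fails loudly at the first record, which is usually much cheaper to debug than a model trained on broken data.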

To summarize, both options have advantages and disadvantages. Still, the primary question you need to answer to decide is whether you want the ease-of-use of one solution or need that expert functionality of multiple tools despite the complexity of using them.

Appendix B: Outsourcing AI software development

If you decide to build your software, you might be interested in outsourcing that project to a software development provider. This can seem like a viable solution, but it comes with its own pitfalls.

First, most software development services have little to no experience in building software for AI use cases. As these projects often require an understanding of how AI works and expertise in integrating AI into the product itself, it can be difficult to find a provider with the expertise needed to build your solution.

Secondly, when outsourcing software development, you are giving up much of the control that probably made you choose to build the software yourself in the first place.

Finally, if you decide to go down this road, ensure that you have a clear hand-over plan as you, in all likelihood, don’t want to pay a software developer in perpetuity.
