An overview of why you need expert annotators and how you can minimize the cost of having expensive experts label data for you.
If you have ever worked on a Machine Learning (ML) project, you know that the most vital component of a solution's success is a good data asset. Some lucky organizations can find that in publicly available datasets. Most, though, have to create it themselves.
The conventional data annotation approach is often a low-cost, quantity-first effort, relying on an in-house or outsourced team manually labeling enormous amounts of data. In many cases, this low-cost approach can even be ineffective and harmful for your project.
We also see more and more advanced AI applications being built all over the world. With that in mind, we think it's time to reevaluate the annotation workforce.
If you think any annotator can effectively label any data asset, you might want to reconsider. This might seem obvious, but some in the vision AI space still think any annotator can perform any task given good enough guidance. Sure, back in the day, Computer Vision started with use cases that did not require any hard-to-find expertise: a dog/cat classification task, ImageNet, or even the autonomous driving problem. Most human beings could easily label data assets for all of these use cases.
However, since you can now apply Vision AI algorithms in many industries, this pattern is changing. Nowadays, if you are working on a complex Artificial Intelligence (AI) solution for a certain industry, you often find yourself needing a specific expert to assist with annotation. For example, in healthcare, you need a doctor; in agriculture, you need a botanist, and so on. Only with expert help can you create an accurate and comprehensive dataset for your Machine Learning team to work with.
The most important factor here is the complexity of a project. As with all technology, the first use cases that organizations tackle are often the easiest ones. However, as the technology matures and as competition gets fiercer, complexity goes up.
Furthermore, you might need specific specialists in your field. In healthcare, a general practitioner is often not enough; you will need a radiologist or a gynecologist to do the annotation for you.
What happens if you don't have the right experts? A lack of expertise is likely to lead to poor labeling and your model underperforming as a result. As most applied AI projects are working towards achieving a specific target metric, a badly annotated data asset might even kill your project. Like most things in life, data annotation quality tends to outperform quantity.
So let's say you're interested in working with real experts for your next AI project. You'll quite quickly find out that they don't come cheap. For a basic annotation task, you might pay $5 per hour per annotator; for an expert, that might be $25 or even $100 per hour. Some quick back-of-the-napkin calculations will tell you that bringing in experts is too costly for your project. Before deciding that, though, there are two mitigating factors to consider.
The first factor to take into consideration here is that good experts will outperform lower-skilled annotators in terms of quality, and for complex use cases, speed. So even though the hourly rate is higher, the actual cost might be lower once you factor in the additional QA and reannotation costs that come with a team that's not up to the task.
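To make the back-of-the-napkin math concrete, here is a minimal sketch of that trade-off. All rates, throughputs, and rework fractions below are hypothetical illustrations, not measured figures:

```python
# Effective cost per accepted image, including expert QA and relabeling.
# All numbers below are hypothetical illustrations.

def cost_per_image(hourly_rate, images_per_hour, rework_fraction,
                   qa_rate, qa_images_per_hour):
    labeling = hourly_rate / images_per_hour
    qa = qa_rate / qa_images_per_hour
    # Rejected images are relabeled (and re-checked) once on average.
    return (labeling + qa) * (1 + rework_fraction)

# A $5/h generalist on a complex task: slow, and half the work
# gets sent back after expert review ($50/h QA).
generalist = cost_per_image(5, 8, 0.50, 50, 30)

# A $50/h expert: faster on complex data, light QA, little rework.
expert = cost_per_image(50, 25, 0.05, 50, 60)

print(f"generalist: ${generalist:.2f}/image, expert: ${expert:.2f}/image")
```

With these example numbers, the expert comes out cheaper per accepted image despite a 10x hourly rate; the break-even point shifts with how complex the task is and how much rework the cheaper team generates.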
The second factor is that you can use AI to automate a large chunk of the annotation and QA work. For example, we at Hasty have developed AI annotation assistance that automatically trains a model for you and learns as you label. To give you an example of the automation we can offer, here we are annotating PCB boards.
Here, we manage to annotate most of the PCB board automatically (the model was trained on 90 images).
As you can see, our model picks up 90% of all annotations in the image correctly. What that means for you is that those high-paid experts cost 90% less per annotated image. Suddenly, the cost doesn't look that massive.
We also (as the only end-to-end software in the world) offer an AI-powered quality control feature called Error Finder. Error Finder reviews your annotations automatically and provides feedback on any cases where the model disagrees with your manual annotations. It looks like this:
By highlighting potential errors and letting you decide what's actually an error, you save up to 95% on quality control without any reduction in data quality.
Quality control can account for anything from 10–50% of the total data creation cost. It also requires expert knowledge, so being able to review potential issues, accept the correct ones, and reject the bad ones will not only save you time (and reduce strain on your eyes), it will drastically reduce your annotation budget.
Only 13% of vision AI projects make it to production. With Hasty, we boost that number to 100%.
Our comprehensive vision AI platform is the only one you need to go from raw data to a production-ready model. We can help you with:
All the data and models you create always belong to you and can be exported and used outside of Hasty at any time, entirely for free.
You can try Hasty by signing up for free here. If you are looking for additional services like help with ML engineering, we also offer that. Check out our service offerings here to learn more about how we can help.