CloudFactory has acquired Hasty to offer a complete end-to-end Vision AI solution.
20.01.2021 — Tobias Schaffrath Rosario

How to nail your annotation strategy

A guide to finding the right approach for your use-case.


A lot has been written about the different annotation strategies for computer vision projects, explaining the differences between object detection, instance segmentation, and semantic segmentation. But when it comes to the question of which one is right for you, the answer is always: 'It depends on your use case.' That is why we created the infographic below to help you find the right approach for your particular use case.

This infographic is under the 'CC BY-SA' license. So you can share it with everyone you know. Link to the high-resolution image

The different annotation strategies explained 

Whether you're new to the topic or just want to refresh your knowledge, here is a short overview of the three main annotation strategies.

Object Detection 

Object detection is used to locate discrete objects in an image. The annotation is relatively simple, as you only have to draw a tight box around the intended object. The benefits are that storing this information and the required computations are relatively lightweight. The drawback is that the 'noise' in the box (the captured 'background') often interferes with the model learning the object's shape and size. Thus, this method struggles when there is a high level of 'occlusion' (overlapping or obstructed objects) or high variance in an object's shape, and that information can be important: think of types of biological cells or dresses.

An example of an object detection label. Training an object detection model takes less computational effort, but the method has its limitations when objects overlap or differ widely across a class's instances.
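To make the label format concrete, here is a minimal sketch of what a single bounding-box annotation can look like, loosely following the COCO convention of `[x, y, width, height]` boxes. The ids and coordinate values are invented for illustration and are not a specific platform's export format.

```python
# A minimal sketch of a COCO-style bounding-box annotation.
# The ids and values below are made up for illustration.

def xywh_to_xyxy(box):
    """Convert a [x, y, width, height] box to [x1, y1, x2, y2] corners."""
    x, y, w, h = box
    return [x, y, x + w, y + h]

annotation = {
    "image_id": 1,                # hypothetical image id
    "category_id": 3,             # e.g. "car" in a hypothetical label map
    "bbox": [50, 30, 200, 100],   # tight box: x, y, width, height
}

corners = xywh_to_xyxy(annotation["bbox"])
print(corners)  # [50, 30, 250, 130]
```

Note how little data a box annotation needs compared to a per-pixel mask, which is why storage and training cost stay low.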

The most well-known model architecture for object detection is the famous YOLO v3 by Joseph Redmon. SOTA ('state of the art') at the moment is the Cascade Eff-B7 NAS-FPN architecture.

Semantic Segmentation

This is useful for indicating the shape of something where the count is not important, such as the sky, road, or background. The benefit is that you get much richer information on the entire image, as you annotate every pixel; your goal is to know exactly where regions are and what shape they have. The challenge with this method is that every pixel needs to be annotated, which makes the process time-consuming and error-prone. Also, it is not possible to differentiate single instances of one class: the final model will only be able to tell whether a pixel belongs to a car or not, but not how many cars are in an image.

An example of semantic segmentation. It is impossible to distinguish between different instances of a class, for example the individual cars or persons.

SOTA on the Cityscapes dataset, which is often used for training in autonomous driving use cases where semantic segmentation is common, is HRNet-OCR.
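A semantic-segmentation label is simply a class id per pixel. The tiny sketch below (with invented class ids: 0 = background, 1 = road, 2 = car) shows why instance counts are lost: you can count pixels per class, but two touching cars would merge into one undifferentiated region.

```python
import numpy as np

# A minimal sketch of a semantic-segmentation label: one class id per pixel.
# Class ids (0 = background, 1 = road, 2 = car) are invented for illustration.
mask = np.array([
    [1, 1, 0, 0],
    [1, 2, 2, 0],
    [1, 2, 2, 0],
])

# Per-class pixel counts: we know *where* each class is, but not how many
# separate objects of that class exist.
classes, counts = np.unique(mask, return_counts=True)
print(dict(zip(classes.tolist(), counts.tolist())))  # {0: 4, 1: 4, 2: 4}
```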

Instance Segmentation

This is useful for indicating discrete objects such as car 1, car 2, flower a, flower b, or actuator. The benefits are that objects' shapes and attributes are learned far faster, with fewer examples needing to be shown, and occlusions are handled much better than with object detection. The challenge is that this method has a very time-consuming and error-prone annotation process.

An example of instance segmentation on an image taken on a trip to San Francisco. The individual persons are recognized as separate instances.

The most widely used model architecture for instance segmentation is Mask R-CNN. SOTA is the Cascade Eff-B7 NAS-FPN, as it is for object detection.
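In contrast to a semantic mask, an instance-segmentation label keeps one mask per object, so two instances of the same class stay separate and can be counted. A minimal sketch, with invented category names and toy 2x4 masks:

```python
import numpy as np

# A minimal sketch of an instance-segmentation label: one binary mask per
# object. The category names and mask shapes are invented for illustration.
instances = [
    {"category": "person", "mask": np.array([[1, 1, 0, 0],
                                             [1, 1, 0, 0]])},
    {"category": "person", "mask": np.array([[0, 0, 1, 1],
                                             [0, 0, 1, 1]])},
]

# Unlike semantic segmentation, we can count objects per class:
n_persons = sum(1 for inst in instances if inst["category"] == "person")
print(n_persons)  # 2
```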

Some general advice for annotating images

From working on practical computer vision projects daily here at Hasty, we have learned quite a lot about how to approach the image annotation process. Generally speaking, there are three tips we can give you along the way, no matter which annotation strategy you use:

  1. Annotate as much data as you can yourself; don't just outsource it to another company or delegate it to the intern. Labeling the data yourself will help you develop a deep understanding of your data and detect issues like data shift early on. Make sure to use an annotation platform with high levels of annotation automation that allows you to annotate fast, so you don't spend weeks or months doing work that can be automated.
  2. Prototype quickly; don't wait until all the data is labeled. Only then will you be able to identify potential pitfalls. And let's be honest, even the best ML engineers need to run through a few iterations before deploying a model to production. Thus, when choosing an annotation platform, make sure that it allows you to work in an agile fashion.
  3. Double-check the quality of your data. A model's quality is limited by the quality of the data fed to it, so make sure to have a quality assurance process for your data in place. SOTA annotation platforms provide features that leverage neural networks to do this for you.
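One simple quality-assurance check from the third tip can be sketched in a few lines: compare two annotators' boxes for the same object using intersection over union (IoU) and flag low-agreement labels for review. The threshold and box values below are illustrative assumptions, not a description of any particular platform's QA pipeline.

```python
# A minimal sketch of an annotation QA check using intersection over union.
# The 0.8 threshold and box coordinates are invented for illustration.

def iou(a, b):
    """IoU of two axis-aligned [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

box_annotator_1 = [10, 10, 50, 50]
box_annotator_2 = [12, 12, 52, 52]
score = iou(box_annotator_1, box_annotator_2)
print(score > 0.8)  # True here; low agreement would flag the label for review
```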

We hope that our infographic and this post provide some value and help you get started with your annotation strategy. If you have any questions or comments, reach out to me at [email protected] or join the discussion in our community.

Shameless plug time

Only 13% of vision AI projects make it to production. With Hasty, we boost that number to 100%.
Our comprehensive vision AI platform is the only one you need to go from raw data to a production-ready model. We can help you with:

All the data and models you create always belong to you and can be exported and used outside of Hasty at any given time entirely for free.

You can try Hasty by signing up for free here. If you are looking for additional services like help with ML engineering, we also offer that. Check out our service offerings here to learn more about how we can help.
