

How to speed up your Instance Segmentation task with Hasty

Nursulu Sagimbayeva

We gathered a dataset of 378 medieval paintings and labeled all the adults and children using Hasty's features. In just about 5 hours, we achieved a Mask mAP of 34.27 and a loss of 0.35 on the validation set. Check out the blog post to learn more about the Instance Segmentation pipeline in Hasty!

If you have ever worked on a Data Science-related task, you know that having correct labels is crucial for your model’s performance. Today we want to show you how Hasty can speed up the annotation process and save hours of your life.

Let’s jump in!

Project workflow

For this post, we opted for an Instance Segmentation use-case. We gathered a dataset of 378 medieval paintings and decided to train a model that labels Adults and Children in these images.

The general workflow was as follows:

  1. Annotating a part of the data to train an Instance Segmentation AI assistant;
  2. Applying the Automated Labeling feature to label the majority of the data automatically;
  3. Using the AI Consensus Scoring feature to evaluate the created labels and fix them if necessary;
  4. Using Model Playground to train a model.

Data annotation

Elapsed time: 10 minutes

Once you have created a Hasty project, click on any image to begin. You will find various annotation instruments in the toolbar on the left.

For an Instance Segmentation task, you will likely use an AI-powered Instance Segmentation assistant (IS). Like any supervised learning-based model, it needs a bit of training before making its predictions. To unlock the IS tool, we manually labeled 10 pictures so the assistant could learn from them.

Instance Segmentation assistant

For the initial data annotation, Hasty offers advanced AI-powered tools – DEXTR, Atom, and Box-to-instance. Manual tools such as Polygon, Brush, and Bounding box are also available.

  • A piece of advice: using hotkeys saves tons of time. Just trust us.

N.B.: After annotating each picture, you should change the image status from “New” to “Done” or “To review.” This signals to the assistant that the image can be used in the training batch.

These statuses can be used interchangeably. In our case, we opted for “To review.”

After annotating 10 images, we trained the first model for the Instance Segmentation assistant.

The perk of using the IS assistant is that it retrains and improves continuously as we label more data. This makes the annotation process increasingly faster and easier.

Automated labeling

Elapsed time: 1 hour

Automated labeling is a Hasty feature that applies the IS assistant to all the images in your dataset with a single click. The function becomes available once the IS assistant is trained well and beats certain performance thresholds.

To unlock the Automated labeling feature, you should apply the IS tool to a couple of pictures and accept the suggested labels without editing.

Automated labeling feature

In our case, we unlocked the Automated labeling feature after labeling around 98 pictures. At this point, the IS assistant was trained well enough to produce accurate predictions that did not need editing. Then, we activated Automated labeling and took a break while the feature annotated the rest of the images.

Choose the IS assistant and start a run

As a result, 270 images were labeled automatically – more than 70% of the dataset – in around 5 minutes. Annotating the same number of images manually would have taken about 13.5 hours (assuming labeling one image takes us 3 minutes on average).
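As a sanity check on those numbers, here is the back-of-the-envelope arithmetic (our own estimate – the 3-minutes-per-image figure is our average, not a Hasty benchmark):

```python
# Rough estimate of the time Automated labeling saved us.
auto_labeled_images = 270
minutes_per_image = 3        # our average manual labeling time per image
dataset_size = 378

manual_hours = auto_labeled_images * minutes_per_image / 60
print(f"Manual labeling would take ~{manual_hours} hours")            # ~13.5 hours
print(f"Share of dataset auto-labeled: {auto_labeled_images / dataset_size:.0%}")  # 71%
```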

AI Consensus Scoring

Elapsed time: 2.5 hours (including waiting)

At this point, it is tempting to think that Automated labeling is the ultimate solution, and you can happily proceed further. However, it is always better to review your labels to check for potential mistakes.

AI Consensus Scoring is a Hasty feature that allows you to review your annotations. It automates the tedious task of manual data cleaning using SOTA AI approaches. The feature highlights the potential errors and suggests labels that you can accept or reject.

Potential errors detected by AI CS

To start the review, access the feature in the burger menu on the left, and create a new run.

AI Consensus Scoring

You need to specify:

  • The task (Instance segmentation review, in our case);
  • The statuses of the images you want to check;
  • And the classes you want to review.

Do not forget to toggle on the “Retrain model” flag to ensure that the models used to assess your labels are up to date and more likely to deliver top-notch results.

Create a new AI Consensus Scoring run

In our case, the majority of produced annotations were of high quality – especially when the following conditions were met:

  • There was only one person in the painting;
  • There were several people who did not overlap;
  • The images had enough contrast.
Successful automated annotations

However, in some cases, mistakes slipped into the predictions. This is not surprising given that the assistant was trained on a limited number of pictures. Moreover, there was high variability across the dataset: some pictures portrayed a single person, while others featured many; people were painted in different poses and styles, and so on.

Here are the most common errors we came across:

  • Several people were labeled as one;
  • Some people were not detected by the assistant at all (most likely due to low contrast or unusual poses);
  • Objects were identified as humans;
    Personally, we do not regard horses as less worthy than humans, but the imposed labels left us no choice.
  • Some body parts were detected or filled incorrectly;
  • Wrong labels: almost always, children were mistaken for adults or not recognized as separate objects at all. One should not blame the model for discriminatory beliefs – most likely, this happened due to a massive class imbalance between the Adult and Child classes in the dataset.
    Class imbalance

Correcting some errors took several seconds, while more complex ones took several minutes to edit. Once the annotations were adjusted, we changed the images’ status to “To review” so the model could continue to improve.

The workflow at this step is the following:

  1. Run an AI Consensus Scoring review;
  2. Check the mistakes and change the status of the reviewed images to “Done” or “To review”;
  3. If you think the IS assistant has improved in the process, you can use the Automated labeling feature again. To do so, delete the old labels from the images you want to label automatically and change their status to “New”;
  4. Apply Automated labeling;
  5. Run an AI Consensus Scoring review once again;
  6. Repeat if needed.

Now it is time to customize the model configuration!

Model Playground

Elapsed time: 1.5 hours

Model Playground is a no-code solution offered by Hasty. It allows you to experiment with different neural network architectures and parameters. To access the feature, please use the burger menu on the left, as shown in the screenshot below.

Model Playground

First, we created a data split by clicking a button in the right corner of the screen.

Create a data split

In the split, you can specify various parameters of your future experiment, including the size of your train, test, and validation sets. In our case, we decided to include all the images in the split.

  • Please note that if you want to run experiments more quickly at the cost of statistical accuracy, you can reduce the size of your split.

After creating the split, we scheduled an experiment.

Schedule an experiment

Fortunately, Hasty offers a ready-made template with all the parameters optimized for an Instance Segmentation task.

Of course, the preset parameters are not a universal optimum, so you can play around and adapt them to your needs. You can tune the model's architecture down to the smallest detail, use image augmentation, set the desired metrics, training parameters, and more.

Still, for us, the template was just fine. We did not change the default settings much – only increased the number of iterations from 1000 to 5000 and the size of the train batch from 1 to 8.

Training parameters

Once the parameters are set, you can start the experiment, make a cup of matcha tea, open a YouTube video (for example, our recent tutorial about Instance Segmentation), and wait for the results. Depending on the dataset's size and selected parameters, it can take from several minutes to hours and even days.


Here are the results we managed to obtain.

Ideally, you should aim for the least possible loss – most notably, on a validation set. In our case, after around 500 iterations, the validation loss stopped at 0.35 and did not change much afterward, so we took 0.35 as a final number.

mean Average Precision

As for the metric, we used Mask mAP (mean Average Precision) to evaluate our model. The closer the metric value is to 1 (or 100, if you count in percentages), the better your model predicts the instances. We achieved a mAP of 90.75 on the training set and 34.27 on the validation set. This large discrepancy shows that our model clearly overfits.
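Under the hood, mAP rests on the Intersection-over-Union between predicted and ground-truth masks: a prediction counts as a true positive when its IoU with a ground-truth instance exceeds a threshold, and precision is then averaged over recall levels, classes, and thresholds. A minimal sketch of mask IoU (our own illustration, not Hasty's implementation):

```python
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-Union between two boolean instance masks."""
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(intersection / union) if union else 0.0

# Toy 4x4 masks: the prediction covers half of the ground-truth region.
pred = np.zeros((4, 4), dtype=bool)
pred[:2, :2] = True                   # 4 pixels
gt = np.zeros((4, 4), dtype=bool)
gt[:2, :] = True                      # 8 pixels, 4 of them shared

print(mask_iou(pred, gt))  # 0.5
```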

The result is not perfect, most likely due to the significant class imbalance and the overall complexity of the images (low contrast confuses the model). In a real scenario, we could improve it by oversampling the Child class, adding and labeling more images with children, or using the Equalize augmentation. However, at this point, we decided to call it a day.
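For reference, the simplest form of oversampling just resamples the minority class with replacement until the classes are balanced. A sketch with made-up records mirroring our label counts (781 Adult, 70 Child) – not something Hasty does for you automatically:

```python
import random

random.seed(42)

# Hypothetical (image_id, class) records matching our final label counts.
labels = [("adult_img", "Adult")] * 781 + [("child_img", "Child")] * 70

adults = [rec for rec in labels if rec[1] == "Adult"]
children = [rec for rec in labels if rec[1] == "Child"]

# Naive oversampling: draw Child labels with replacement until balanced.
balanced = adults + random.choices(children, k=len(adults))
print(len(balanced))                                # 1562
print(sum(rec[1] == "Child" for rec in balanced))   # 781
```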


Overall time: 5 hours 10 minutes

We started off with a dataset of 378 pictures. We annotated 10 pictures and unlocked an AI-powered Instance Segmentation assistant.

After annotating 98 more images, we unlocked the Automated labeling feature, which automatically labeled the remaining 270 images.

Then we used the AI Consensus Scoring feature to review and adjust the labels suggested by the Automated labeling.

As a result, we got 851 labels in the dataset: 781 adults and 70 children. To train a model, we created a new split in Model Playground. 264 images (70%) were used as a training set, 76 images (20%) as a test set, and 38 (10%) as a validation set.
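Those counts follow from a 70/20/10 partition of the 378 images. A sketch of how such a split can be computed (integer rounding may shift a count by an image or two compared to the split Hasty produces):

```python
import random

random.seed(0)
image_ids = list(range(378))
random.shuffle(image_ids)

n_train = len(image_ids) * 70 // 100   # 264 images
n_test = len(image_ids) * 20 // 100    # 75 images
train = image_ids[:n_train]
test = image_ids[n_train:n_train + n_test]
val = image_ids[n_train + n_test:]     # the remaining ~10%

print(len(train), len(test), len(val))
```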

The last step was scheduling a Model Playground experiment with the Basic Instance Segmentor parameter template.

Due to a significant class imbalance – around 11:1 – the validation loss turned out pretty high (0.35), and the Mask mAP low (34.27).

Visualization of the class imbalance

However, obtaining perfect results was not our primary goal. What we wanted was to describe the general workflow for an Instance Segmentation task that you can implement with Hasty. As you can see, with the help of AI-powered annotation, we managed to accomplish a lot in just a couple of hours :)

How Hasty can help…

If you are looking for a quick data annotation solution, look no further! Hasty is a vision AI platform that helps you throughout the ML lifecycle. To date, we can help you with:

  • Automating up to 90% of all annotation work;
  • Making quality control 35x faster;
  • Training models directly on your data using our low-code model builder;
  • Taking any custom models trained in Hasty and deploying them back to the annotation environment in one click;
  • Exporting any models you create in commonly used formats;
  • Hosting any model in our cloud;
  • Monitoring inferences made in production;
  • And, most importantly, offering all of this through an API for easy integration.

In short, we take care of a lot of the MLOps so you don’t have to. Please, book a demo if you want to know more.
