26.07.2022 — Nursulu Sagimbayeva

How to speed up your Instance Segmentation task with Hasty

We gathered a dataset of 378 medieval paintings and labeled all the adults and children using Hasty's features. In just about 5 hours, we achieved a Mask mAP of 34.27 and a loss of 0.35 on the validation set. In this post, we walk through the whole Instance Segmentation pipeline in Hasty.


If you have ever worked on a Data Science-related task, you know that having correct labels is crucial for your model’s performance. Today we want to show you how Hasty can speed up the annotation process and save hours of your life.

Let’s jump in!

Project workflow

For this post, we opted for an Instance Segmentation use case. We gathered a dataset of 378 medieval paintings and decided to train a model that labels Adults and Children in these images.

The general workflow was as follows:

  1. Annotating a part of the data to train an Instance Segmentation AI assistant;
  2. Applying the Automated Labeling feature to label the majority of the data automatically;
  3. Using the AI Consensus Scoring feature to evaluate the created labels and fix them if necessary;
  4. Using Model Playground to train a model.

Data annotation

Elapsed time: 10 minutes

Once you have created a Hasty project, click on any image to begin. You will find various annotation instruments in the toolbar on the left.

For an Instance Segmentation task, you will likely use an AI-powered Instance Segmentation assistant (IS). Like any supervised learning-based model, it needs a bit of training before making its predictions. To unlock the IS tool, we manually labeled 10 pictures so the assistant could learn from them.

Instance Segmentation assistant

For the initial data annotation, Hasty offers advanced AI-powered tools – DEXTR, Atom, and Box-to-instance. Manual tools such as Polygon, Brush, and Bounding box are also available.

N.B.: After annotating each picture, you should change the image status from “New” to “Done” or “To review.” This signals to the assistant that the image can be included in its training batch.

For this purpose, the two statuses are interchangeable. In our case, we opted for “To review.”

After annotating 10 images, we trained the first model for the Instance Segmentation assistant.

The perk of using the IS assistant is that it retrains and improves continuously as we label more data. This makes the annotation process increasingly faster and easier.

Automated labeling

Elapsed time: 1 hour

Automated labeling is a Hasty feature that applies the IS assistant to all the images in your dataset with a single click. The function becomes available once the IS assistant is trained well and beats certain performance thresholds.

To unlock the Automated labeling feature, you should apply the IS tool to a couple of pictures and accept the suggested labels without editing.

Automated labeling feature

In our case, we unlocked the Automated labeling feature after labeling around 98 pictures. At this point, the IS assistant was trained well enough to produce accurate predictions that did not need editing. Then, we activated Automated labeling and took a break while the feature annotated the rest of the images.

Choose the IS assistant and start a run

As a result, 270 images were labeled automatically – more than 70% of the dataset – in around 5 minutes. Annotating the same number of images manually would have taken about 13.5 hours (assuming an average of 3 minutes per image).
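If you want to double-check the math (the 3-minutes-per-image figure is our own estimate), the back-of-the-envelope calculation looks like this:

```python
# Back-of-the-envelope estimate of the time saved by Automated labeling.
# The 3-minutes-per-image figure is our own assumption for this project.
auto_labeled_images = 270
minutes_per_image = 3        # assumed average manual labeling time
automated_run_minutes = 5    # how long the Automated labeling run took

manual_minutes = auto_labeled_images * minutes_per_image
saved_hours = (manual_minutes - automated_run_minutes) / 60

print(f"Manual labeling: {manual_minutes / 60:.1f} h")  # 13.5 h
print(f"Time saved:      {saved_hours:.1f} h")          # ~13.4 h
```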

AI Consensus Scoring

Elapsed time: 2.5 hours (including waiting)

At this point, it is tempting to think that Automated labeling is the ultimate solution, and you can happily proceed further. However, it is always better to review your labels to check for potential mistakes.

AI Consensus Scoring is a Hasty feature that allows you to review your annotations. It automates the tedious task of manual data cleaning using SOTA AI approaches. The feature highlights the potential errors and suggests labels that you can accept or reject.

Potential errors detected by AI CS

To start the review, access the feature in the burger menu on the left, and create a new run.

AI Consensus Scoring

You need to specify the parameters of the run in the dialog shown below.

Do not forget to toggle on the “Retrain model” flag to ensure that the models used to assess your labels are up to date and more likely to deliver top-notch results.

Create a new AI Consensus Scoring run

In our case, the majority of the produced annotations were of high quality, especially when certain conditions were met.

Successful automated annotations

However, in some cases, mistakes slipped into the predictions. This is not surprising given that the assistant was trained on a limited number of pictures. Moreover, there was high variability across the dataset: some pictures portrayed a single person while others showed many, people were painted in different poses and styles, and so on.

Here are the most common errors we came across:

Correcting some errors took several seconds, while more complex ones took several minutes to edit. Once the annotations were adjusted, we changed the images’ status back to “To review” so that the model could keep improving.

The workflow at this step is the following:

  1. Run an AI Consensus Scoring review;
  2. Check the mistakes and change the status of the reviewed images to “Done” or “To review”;
  3. If you think the IS assistant has improved in the process, you can use the Automated labeling feature again. To do so, delete the old labels from the images you want to label automatically and change their status to “New”;
  4. Apply Automated labeling;
  5. Run an AI Consensus Scoring review once again;
  6. Repeat if needed.

Now it is time to customize the model configuration!

Model Playground

Elapsed time: 1.5 hours

Model Playground is a no-code solution offered by Hasty. It allows you to experiment with different neural network architectures and parameters. To access the feature, please use the burger menu on the left, as shown in the screenshot below.

Model Playground

First, we created a data split by clicking a button in the right corner of the screen.

Create a data split

In the split, you can specify various parameters of your future experiment, including the size of your train, test, and validation sets. In our case, we decided to include all the images in the split.

After creating the split, we scheduled an experiment.

Schedule an experiment

Fortunately, Hasty offers a ready-made template with all the parameters optimized for an Instance Segmentation task.

Of course, the preset parameters are not universally optimal, so you can play around and adapt them to your needs: select and tune the model architecture down to the smallest detail, add image augmentations, choose the desired metrics, set training parameters, and more.

Still, the template worked just fine for us. We did not change the default settings much – we only increased the number of iterations from 1000 to 5000 and the training batch size from 1 to 8.

Training parameters
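To put those two numbers in perspective, here is a rough estimate – our own calculation, not something Hasty reports – of how many passes over the training set this corresponds to, given the 264-image training split described in the summary below:

```python
# Rough estimate (ours, not a Hasty output) of how many passes over the
# training set 5000 iterations with a batch size of 8 correspond to.
iterations = 5000     # raised from the template default of 1000
batch_size = 8        # raised from the template default of 1
train_images = 264    # our training split, 70% of 378 images

images_seen = iterations * batch_size
epochs = images_seen / train_images
print(f"~{epochs:.0f} passes over the training set")  # roughly 151
```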

Once the parameters are set, you can start the experiment, make a cup of matcha tea, open a YouTube video (for example, our recent tutorial about Instance Segmentation), and wait for the results. Depending on the dataset's size and selected parameters, it can take from several minutes to hours and even days.

Results

Here are the results we managed to obtain.

Ideally, you should aim for the lowest possible loss – most importantly, on the validation set. In our case, the validation loss plateaued at 0.35 after around 500 iterations and did not change much afterward, so we took 0.35 as the final number.

mean Average Precision

As for the metric, we used Mask mAP (mean Average Precision) to evaluate our model. The closer the metric value is to 1 (or 100, if you count in percentages), the better the model predicts the instances. We achieved a Mask mAP of 90.75 on the training set and 34.27 on the validation set. Such a large gap means our model clearly overfits.
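For context, Mask mAP builds on the Intersection over Union (IoU) between predicted and ground-truth masks: a predicted instance counts as correct when its IoU with a ground-truth instance exceeds a threshold, and precision is then averaged over recall levels, IoU thresholds, and classes. Here is a minimal NumPy sketch of the underlying IoU computation, using toy masks rather than our actual data:

```python
import numpy as np

# Toy illustration of mask IoU, the quantity Mask mAP is built on.
# Real mAP additionally averages precision over recall levels, IoU
# thresholds, and classes; the masks below are made up for the example.
def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(intersection) / float(union) if union > 0 else 0.0

pred = np.zeros((100, 100), dtype=bool)
gt = np.zeros((100, 100), dtype=bool)
pred[20:60, 20:60] = True  # predicted instance mask
gt[30:70, 30:70] = True    # ground-truth instance mask

print(f"IoU = {mask_iou(pred, gt):.2f}")  # 0.39 for these toy masks
```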

The result is not perfect, most likely due to the significant class imbalance and the overall complexity of the images (low contrast confuses the model). In a real scenario, we could improve it by oversampling the Child class, adding and labeling more images with children, or using the Equalize augmentation. However, at this point, we decided to call it a day.
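Still, to illustrate the first remedy: oversampling can be as simple as repeating the images that contain the minority class in the training list. The sketch below uses made-up file names and counts that only mirror our roughly 11:1 imbalance – it is not something we ran in this project:

```python
import random

# Simplified oversampling sketch: repeat images that contain the minority
# class ("Child") so the classes are seen more evenly during training.
# File names and counts are placeholders that only mirror our ~11:1 ratio.
adult_only_images = [f"adult_{i}.jpg" for i in range(240)]
child_images = [f"child_{i}.jpg" for i in range(24)]

oversample_factor = len(adult_only_images) // len(child_images)  # 10
train_list = adult_only_images + child_images * oversample_factor
random.shuffle(train_list)

print(len(train_list))  # Child images now appear ~10x as often as before
```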

Summary

Overall time: 5 hours 10 minutes

We started off with a dataset of 378 pictures. We annotated 10 pictures and unlocked an AI-powered Instance Segmentation assistant.

After annotating 98 more images, we unlocked the Automated labeling feature, which automatically labeled the remaining 270 images.

Then we used the AI Consensus Scoring feature to review and adjust the labels suggested by the Automated labeling.

As a result, we got 851 labels in the dataset: 781 adults and 70 children. To train a model, we created a new split in Model Playground. 264 images (70%) were used as a training set, 76 images (20%) as a test set, and 38 (10%) as a validation set.
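For reference, an equivalent 70/20/10 split can be reproduced outside the tool in a few lines of NumPy. This is just a sketch that assumes an exported list of image names; the exact per-set counts differ from Hasty's by one image due to rounding:

```python
import numpy as np

# Reproduce a 70/20/10 train/test/validation split over 378 images.
# The file names are placeholders for an exported image list.
rng = np.random.default_rng(seed=42)
images = np.array([f"painting_{i:03d}.jpg" for i in range(378)])
rng.shuffle(images)

n_train = int(0.7 * len(images))  # 264
n_test = int(0.2 * len(images))   # 75 (Hasty rounded this to 76)
train, test, val = np.split(images, [n_train, n_train + n_test])

print(len(train), len(test), len(val))  # 264 75 39
```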

The last step was scheduling a Model Playground experiment with the Basic Instance Segmentor parameter template.

Due to a significant class imbalance – around 11:1 – the validation loss turned out pretty high at 0.35, and the Mask mAP was correspondingly low at 34.27.

Visualization of the class imbalance

However, obtaining perfect results was not our primary goal. What we wanted to do here was describe the general workflow for an Instance Segmentation task that you can implement with Hasty. As you can see, with the help of AI-powered annotation, we managed to accomplish a lot in just a couple of hours :)

Shameless plug time

Only 13% of vision AI projects make it to production. With Hasty, we boost that number to 100%.
Our comprehensive vision AI platform is the only one you need to go from raw data to a production-ready model – and we can help you at every step of that journey.

All the data and models you create always belong to you and can be exported and used outside of Hasty at any given time entirely for free.

You can try Hasty by signing up for free here. If you are looking for additional services like help with ML engineering, we also offer that. Check out our service offerings here to learn more about how we can help.
