
2022.08.09

How to reduce labeling effort and increase automation in Object Detection

Vladimir Lyashenko

We built a model that detects whether construction workers in images wear helmets or not. To do so, we used the Hard Hat Workers Object Detection Dataset. By activating AI assistants in Hasty, we achieved pretty impressive results in just about 12 hours. To see exactly how the workflow was implemented, check out this article.

Today, everyone is fascinated with the automation of Machine Learning workflows. Some people say it is possible to speed up the process 10 times; others insist that even 100x is possible. Unfortunately, very few back up their claims with objective evidence.

That is not the case here. In this post, we will squeeze every last drop of automation out of the tools Hasty has built. We will work with public data from the construction sector, so anybody can quickly reproduce our results if needed.

Let's get into it!

Task review

As you might know, many workplace accidents in construction and other sectors occur because people ignore safety precautions. However, with the fast development of Computer Vision algorithms, it is becoming possible to detect such violations and act before it is too late. For example, Object Detection algorithms can now identify whether a person wears a hard hat in workplace settings that require it.

For this post, we will build our own “Hard hat” detection model using all of Hasty. We are using the Hard Hat Workers Object Detection Dataset as our initial data.

Using the Hasty UI, we uploaded the dataset to the Hasty platform. The initial dataset had 5269 images in the train set and 1766 in the test set. Still, some failed to upload because of network connection issues (36 train images and 9 test images), so we ended up with 5233 train and 1757 test images.

In the dataset’s original annotations, there are three classes:

  • Head (with label color being yellow) - human’s head without a helmet;
    Example of the Head class
  • Helmet (with label color being pink) - human’s head in a helmet;
    Example of the Helmet class
  • Person (with label color being blue).

However, the Person class seems underused: it has only 615 labels across the whole dataset despite many images containing multiple people. In comparison, the total number of annotations is 27 039. That is why we imported only the Head and Helmet annotations for the train part of the project via the Hasty Import annotations feature.
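If you want to reproduce this filtering step outside the platform, here is a minimal sketch that drops the Person labels before import. It assumes a COCO-style JSON export; the file names and schema are illustrative, not the dataset's actual format.

```python
import json

# Load a COCO-style annotation file (hypothetical file name).
with open("hard_hat_workers.json") as f:
    dataset = json.load(f)

# Keep only the two classes we actually import into Hasty.
keep_names = {"Head", "Helmet"}
keep_ids = {c["id"] for c in dataset["categories"] if c["name"] in keep_names}

dataset["categories"] = [c for c in dataset["categories"] if c["id"] in keep_ids]
dataset["annotations"] = [a for a in dataset["annotations"] if a["category_id"] in keep_ids]

with open("hard_hat_workers_filtered.json", "w") as f:
    json.dump(dataset, f)
```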

We skimmed through the annotated images to ensure that the data was clean and the annotations were correct, and it seemed to be the case. So, we did not manually look for and fix artifacts before training ML models.

Testing the assistants

Hasty makes your annotation process easier by providing AI tools to help you with labeling. The assistants train/retrain whenever you reach a certain number of annotations in your project. In our case, we imported 19 758 annotations for 5233 train images, so the platform instantly triggered the assistants’ training.
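Conceptually, the trigger is just a threshold check on the running annotation count. Here is a toy sketch of the idea; the threshold values are invented for illustration, and Hasty manages the real logic internally.

```python
# Invented thresholds for illustration only.
RETRAIN_THRESHOLDS = [100, 1_000, 10_000]

def should_retrain(previous_count: int, current_count: int) -> bool:
    """Return True when the annotation count crosses any threshold."""
    return any(previous_count < t <= current_count for t in RETRAIN_THRESHOLDS)

# Importing 19 758 annotations at once crosses every threshold,
# so assistant training kicks off immediately.
assert should_retrain(0, 19_758)
```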

Since we were working on the Object Detection vision AI task, Hasty trained Class Predictor and Object Detection assistants on the train set.

  • Class Predictor assistant specifies the class of an object you are labeling, speeding up the annotation process. We did not use it much throughout the work, but whenever we drew bounding boxes manually, it backed us up by suggesting the right class;
  • Object Detection assistant helps you find, create, and label bounding boxes in a specific image - in short, everything you need when preparing data for OD. This is the assistant we used when annotating the test set of the initial dataset.

Because of the vast amount of data we added to the first training batch (5233 images from the train set), it took from 30 to 45 minutes for each assistant to train.

When the assistants were ready, we tested them on the test images and were pretty much satisfied with the results. It appeared that the assistants worked well on the majority of the test set, even with the standard confidence of 30%. Moreover, in some cases, assistants produced perfect results under challenging circumstances.

Let’s check some examples.

Confidence 30

You can see the bounding boxes suggested by the assistants in the image above. As you might notice, they detected a hard hat in the far distance, which is an excellent result.

Confidence 30

Additionally, assistants performed well even if a human head or a hard hat was not 100% visible in an image (for example, when an object was partially occluded or cropped by the image border).

Of course, not all the suggestions were perfect. For example, the assistants did not distinguish between two overlapping hard hats of the same color and marked them as a single object. Also, they usually labeled a person wearing any kind of headwear as wearing a hard hat. Sometimes, the suggestions were plain weird: the assistants labeled human fists as a head or a hard hat, drew bounding boxes on a random wall, and so on.

Confidence 30
Confidence 30

We tried to avoid such cases by adjusting the confidence modifier.

Confidence 30
Confidence 80

Still, this tactic does not always work perfectly because you might filter out correct suggestions on some images.

Confidence 30
Confidence 80

So, when annotating the test set, we stuck with a confidence of 60 but still sometimes had to either reject some of the suggestions or manually annotate missed objects. In general, though, the assistants worked well. Some of the suggestions were very good or even perfect, whereas others left more to be desired, but we minimized the latter by adjusting the confidence modifier.
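Under the hood, the confidence modifier is simply a cut-off on the model's per-box scores. The sketch below shows the filtering logic under an assumed prediction format; it is an illustration, not Hasty's internal code.

```python
# Hypothetical prediction format: [x, y, w, h] box, class label, model score.
raw_predictions = [
    {"bbox": [12, 40, 30, 30], "label": "Helmet", "score": 0.91},
    {"bbox": [80, 22, 28, 28], "label": "Head", "score": 0.55},
    {"bbox": [200, 10, 15, 15], "label": "Helmet", "score": 0.33},  # likely noise
]

def filter_suggestions(predictions, min_confidence):
    """Keep only predicted boxes whose score clears the confidence cut-off."""
    return [p for p in predictions if p["score"] >= min_confidence]

# At 0.30 all three boxes survive, noise included; at 0.60 we trade a
# few misses for cleaner suggestions, as described above.
print(filter_suggestions(raw_predictions, min_confidence=0.60))
```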

With the help of AI assistants, we annotated almost the entire test set (only 44 images were left). We spent 4 hours labeling 1713 images and made more than 6200 annotations, which works out to a bit more than 2 seconds per bounding box.

That’s decent already, but we know it’s not state-of-the-art in terms of performance. From our perspective, to further improve, we needed to clean up the data to give us better-performing automation. So next, we used our AI QA feature.

AI Consensus Scoring

AI Consensus Scoring is Hasty's AI-powered quality assurance feature that finds potential errors in a dataset. AI CS makes checking the labels very convenient, so we used it on our task. In our case, we told AI CS to retrain the Object Detection model on the latest data first, which is why it took the algorithm some time (about 2.5 hours) to produce the results. AI CS checked 26 046 annotations across 6918 images and found 1177 potential errors in them.

AI Consensus Scoring presents the results of each run in a dashboard, so you can review the suggestions and fix the issues from one place. Also, Hasty allows you to filter the AI CS suggestions by error type. In our case, AI CS found some Low IoU errors (low IoU between the original bounding box from the dataset's annotations and the predicted one), Extra label errors, and Missing label errors. For our task, the most crucial were the Missing label errors, so we filtered the suggestions accordingly and reviewed the results.
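The Low IoU check boils down to comparing each original box with the model's closest prediction using the standard Intersection over Union metric. Here is a minimal reference implementation of that metric (the standard formula, not Hasty's internal code):

```python
def iou(box_a, box_b):
    """Intersection over Union for two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# An annotation whose best-matching prediction scores a low IoU
# (say, below 0.5) is worth a manual look.
assert iou([0, 0, 10, 10], [0, 0, 10, 10]) == 1.0
assert iou([0, 0, 10, 10], [20, 20, 30, 30]) == 0.0
```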

AI Consensus Scoring Missing label example 1
AI Consensus Scoring Missing label example 2

As you can see, with AI CS, you can find plenty of labels that were missed in the original dataset, and you do not have to check every image manually. Moreover, the AI CS UI lets you work with many suggestions on one page, so you can swiftly skim through them and correct errors where needed. In about 3 minutes, we added over 120 new annotations. That is very fast for an Object Detection task.

As for the suggestions with the other error types, we also checked them. With Low IoU errors, AI CS suggested enlarging bounding boxes for some objects. We did not do it as most suggestions were related to the train part of the dataset, and we did not want to mess up the original annotations.

Additionally, AI CS found many Extra label errors that were not valuable for us as the annotations seemed fine. However, it is understandable why AI CS provided such suggestions. All the images with at least one Extra label error were of extremely low resolution, which explains AI models’ messy predictions.

To summarize, AI Consensus Scoring served its purpose, highlighted potential errors in the initial annotations, and found the missing labels - what else could you ask for?

Model Playground

At this point, we were sure that the annotations were correct, and we had as many images annotated as possible (within a single dataset). It was time to train a custom Object Detection model through Hasty's Model Playground (https://help.hasty.ai/model-playground/data-split/creating-a-new-split). We took a model with basic parameters and used Hasty's no-code solution to train it. The model had 4842 images for training, 1383 images for the test set, and 691 for validation.
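Those numbers correspond to roughly a 70/20/10 split. If you wanted to reproduce a similar split outside the platform, a minimal sketch could look like this (the ratios come from the numbers above; the helper itself is illustrative):

```python
import random

def split_dataset(image_ids, train=0.7, test=0.2, seed=42):
    """Shuffle and slice image ids into train/test/validation parts."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train)
    n_test = int(len(ids) * test)
    return ids[:n_train], ids[n_train:n_train + n_test], ids[n_train + n_test:]

# 6916 annotated images split roughly like the experiment above:
train_ids, test_ids, val_ids = split_dataset(range(6916))
print(len(train_ids), len(test_ids), len(val_ids))  # 4841 1383 692
```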

It took the model an hour and seven minutes to train. When it was ready, we used the experiment's deploy feature (https://help.hasty.ai/model-playground/model-exports/deploy) to select the trained model as the corresponding AI assistant in the labeling tool. Thus, we updated our Object Detection assistant and returned to the annotation environment.

Testing the updated assistant

The updated assistant worked better than the original one, producing perfect labels at a confidence of 30 for many difficult pictures with overlapping objects of different classes. Not only did it detect all the objects correctly, but it also did not predict any strange bounding boxes with a confidence higher than 30.

Confidence 30
Confidence 30
Confidence 30
Confidence 30
Confidence 30
Confidence 30

The updated assistant also helped to unveil new edge cases. For example, the model detected a bald person as wearing a hard hat with high confidence.

Confidence 80

Needless to say, the updated assistant still made some mistakes and produced weird (and funny) artifacts, as the data we added to the model did not directly address the edge cases. However, that is not something we can fix using only one dataset. To further improve the solution, we should add more data addressing these edge cases.

Confidence 30. Here is one of the weirder edge cases. The shirt pattern confuses the model, and the blobs are identified as hats. We quickly fixed this by filtering at a higher confidence threshold.

From the user’s perspective, the general performance of the updated assistant was good as its suggestions were almost perfect. To test the assistant further, we added random images from the Web that seemed similar to those in the initial Hard Hat dataset.

Confidence 30. Excellent work.
Source: https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQKm_hNXRhLgZtUWWSbCwTMIFTz7UXkociXhA&usqp=CAU
Confidence 30. Not so good, but this image differs from what the model has seen in training: firefighters instead of construction workers.
Source: https://gdb.rferl.org/E06B9AA3-9908-44FB-A3E5-1DDDE02F0AB9_w1200_r1.jpg
Confidence 30. 10/10 - the assistant nailed it.
Source: https://cdnn1.img.sputniknews-uz.com/img/392/88/3928883_0:316:3079:2048_1920x0_80_0_0_59671e3967cc365f0c3ca1e732372e9b.jpg
Confidence 30. Quite remarkable, but there are some extra labels.
Source: https://www.newsler.ru/data/content/2020/92735/5ad7af1613baf12731cc6afb192c62a6.jpg
Confidence 30. Once again - perfect.
Source: https://cdnn1.img.sputniknews-uz.com/img/392/88/3928883_0:316:3079:2048_1920x0_80_0_0_59671e3967cc365f0c3ca1e732372e9b.jpg
Confidence 30. Quite a challenge, but the model worked pretty well. Still, some of the labels are missing.
Source: https://rcmm.ru/uploads/posts/2019-08/1566976208_novosibirsk-tragedija.jpg

As you can see, the model did an excellent job on unseen data. Sure, it did not work perfectly (very few models do when you feed in data from a different source), but the result is very promising and indicates a model that generalizes well. With more data, a bit more QA, and a model that is not trained on standard defaults, you should get a production-ready model in weeks.

Summary

Overall, using Hasty features on the Hard Hat Workers task was a success. Sure, there are edge cases the assistants make mistakes on, but it is crucial to consider the amount of data we used (a single dataset) and the time we spent on the task.

We developed a working solution that performs well on data from other sources in about 12 hours, from the first image imported to the last assistant test. We think that showcases how fast you can build AI today if you use the right tools for the job.

From an annotation automation perspective, we took a default model that could already automate about 70-80% of the annotation work (thanks to the large initial dataset) and got it up to 90% in a couple of hours. This level of automation can be a massive time-saver, with annotators able to annotate a complete image at the press of a button.

We also suspect that a trained ML engineer or Data Scientist could push the model we implemented much further and get even better results.

Thanks for reading, and happy training!

Shameless plug time

Only 13% of vision AI projects make it to production. With Hasty, we boost that number to 100%.
Our comprehensive vision AI platform is the only one you need to go from raw data to a production-ready model.

All the data and models you create always belong to you and can be exported and used outside of Hasty at any given time entirely for free.

You can try Hasty by signing up for free here. If you are looking for additional services like help with ML engineering, we also offer that. Check out our service offerings here to learn more about how we can help.

