Today, everyone is fascinated with the automation of Machine Learning workflows. Some people say it is possible to speed up the processes tenfold. Others insist that even a 100x speedup is achievable. Unfortunately, very few back up their claims with objective evidence.
That will not be the case here. In this post, we will squeeze every last drop of automation out of the tools Hasty has built. We will work with public data from the construction sector, so anybody can quickly reproduce our results if needed.
Let's get into it!
As you might know, in construction and other sectors, many workplace accidents occur because people ignore safety precautions. However, with the fast development of Computer Vision algorithms, it becomes possible to detect such violations and do something about them before it is too late. For example, Object Detection algorithms can now identify whether a person is wearing a hard hat in workplace settings that require it.
For this post, we will build our own “Hard hat” detection model using Hasty end to end. We are using the Hard Hat Workers Object Detection Dataset as our initial data.
Using the Hasty UI, we uploaded the dataset to the Hasty platform. The initial dataset had 5269 images in the train set and 1766 in the test set. Still, some failed to upload because of network connection issues (36 train images and 9 test images), so we ended up with 5233 train and 1757 test images.
In the dataset’s original annotations, there are three classes:
- Head (with label color being yellow) - human’s head without a helmet;
- Helmet (with label color being pink) - human’s head in a helmet;
- Person (with label color being blue).
However, the Person class seems vestigial: it has only 615 labels across the whole dataset despite many people appearing in each image. In comparison, the total number of annotations is 27 039. That is why we imported only the Head and Helmet annotations for the train part of the project via Hasty's Import annotations feature.
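The class filtering step above can be sketched in a few lines. This is a hypothetical illustration assuming COCO-style JSON annotations; the dataset's actual on-disk format may differ, and Hasty's import feature handles this for you.

```python
# Hypothetical sketch: keep only "head" and "helmet" annotations before import,
# dropping the sparsely labeled "person" class. Assumes COCO-style JSON.
KEEP = {"head", "helmet"}

def filter_annotations(coco):
    """Drop annotations whose category name is not in KEEP (e.g. 'person')."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    kept = [a for a in coco["annotations"] if id_to_name[a["category_id"]] in KEEP]
    return {**coco,
            "annotations": kept,
            "categories": [c for c in coco["categories"] if c["name"] in KEEP]}

# Tiny made-up example: one head, one person annotation.
coco = {
    "categories": [{"id": 1, "name": "head"}, {"id": 2, "name": "helmet"},
                   {"id": 3, "name": "person"}],
    "annotations": [{"id": 10, "category_id": 1}, {"id": 11, "category_id": 3}],
    "images": [],
}
filtered = filter_annotations(coco)
print(len(filtered["annotations"]))  # 1 (the person annotation is dropped)
```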
We skimmed through the annotated images to ensure that the data was clean and the annotations were correct, and it seemed to be the case. So, we did not manually look for and fix artifacts before training ML models.
Testing the assistants
Hasty makes your annotation process easier by providing AI tools to help you with labeling. The assistants train/retrain whenever you reach a certain number of annotations in your project. In our case, we imported 19 758 annotations for 5233 train images, so the platform instantly triggered the assistants’ training.
- Class Predictor assistant specifies the class of an object you are labeling, speeding up the annotation process. We did not use it much throughout the work, but if we tried to draw some bounding boxes manually, it would back us up;
- Object Detection assistant helps you find, create, and label bounding boxes in a specific image - in short, everything you need when preparing data for OD. This is the assistant we used when annotating the test set of the initial dataset.
Because of the vast amount of data we added to the first training batch (5233 images from the train set), it took from 30 to 45 minutes for each assistant to train.
When the assistants were ready, we tested them on the test images and were pretty much satisfied with the results. It appeared that the assistants worked well on the majority of the test set, even with the standard confidence of 30%. Moreover, in some cases, assistants produced perfect results under challenging circumstances.
Let’s check some examples.
You can see the bounding boxes suggested by the assistants in the image above. As you might notice, they detected a hard hat in the far distance, which is an excellent result.
Additionally, assistants performed well even if a human head or a hard hat was not 100% visible on an image (something overlapped an object or was cropped out of it).
Of course, not all the suggestions were perfect. For example, the assistants did not distinguish two overlapping hard hats of the same color and marked them as a single object. Also, they often labeled a person wearing any headdress as wearing a hard hat. Sometimes the suggestions were simply odd - the assistants labeled human fists as a head or a hard hat, drew bounding boxes on a random wall, etc.
We tried to avoid such cases by adjusting the confidence modifier.
Still, this tactic does not always work perfectly because you might filter out correct suggestions on some images.
So, when annotating the test set, we stuck with a confidence of 60 but still sometimes had to reject suggestions or manually annotate missed objects. In general, though, the assistants worked well. Some suggestions were very good or even perfect, whereas others left more to be desired, but we minimized those by adjusting the confidence modifier.
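The confidence trade-off described above is easy to see in code. This is a minimal sketch with made-up scores, not Hasty's actual internals: raising the threshold removes spurious boxes but can also drop correct low-confidence suggestions.

```python
# Minimal sketch of how a confidence threshold filters assistant suggestions.
# The suggestions and scores below are invented for illustration.
def filter_by_confidence(suggestions, threshold):
    """Keep only suggestions whose score meets the threshold."""
    return [s for s in suggestions if s["score"] >= threshold]

suggestions = [
    {"label": "helmet", "score": 0.92},  # confident, correct
    {"label": "head",   "score": 0.55},  # correct, but lost at threshold 0.60
    {"label": "helmet", "score": 0.31},  # spurious box on a random wall
]

print(len(filter_by_confidence(suggestions, 0.30)))  # 3 - keeps the noise
print(len(filter_by_confidence(suggestions, 0.60)))  # 1 - drops a correct box too
```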
With the help of the AI assistants, we annotated almost the entire test set (only 44 images were left). We spent 4 hours labeling 1713 images and made more than 6200 annotations, which comes out to a bit more than 2 seconds per bounding box.
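The throughput figure above is a simple back-of-the-envelope calculation:

```python
# Annotation throughput quoted above: 4 hours for 6200+ bounding boxes.
hours = 4
annotations = 6200
seconds_per_box = hours * 3600 / annotations
print(round(seconds_per_box, 1))  # 2.3
```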
That’s decent already, but we know it’s not state-of-the-art in terms of performance. From our perspective, to further improve, we needed to clean up the data to give us better-performing automation. So next, we used our AI QA feature.
AI Consensus Scoring
AI Consensus Scoring is Hasty’s quality assurance feature that uses AI to find potential errors in the dataset. AI CS makes checking the labels very convenient, so we used it on our task. In our case, we told AI CS to retrain the Object Detection model on the latest data first, which is why it took the algorithm some time (about 2.5 hours) to produce the results. AI CS checked 26 046 annotations across 6918 images and found 1177 potential errors in them.
AI Consensus Scoring presents the results of each run in a dashboard, so you can review the suggestions and fix the issues from one place. Also, Hasty allows you to filter the AI CS suggestions by an error type. In our case, AI CS found some Low IoU (low IoU between the original bounding box from the dataset’s annotations and the predicted one), Extra label, and Missing label errors. The most crucial errors were the Missing label for our task, so we filtered all the suggestions and reviewed the results.
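The "Low IoU" error type mentioned above is based on Intersection over Union, the standard overlap metric between two bounding boxes. Here is a short reference implementation (not Hasty's internal code) for boxes in `(x1, y1, x2, y2)` format:

```python
# Intersection over Union (IoU) between two axis-aligned boxes (x1, y1, x2, y2).
# A low IoU between an original annotation and the model's prediction flags
# a box that may need resizing.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))  # 1.0 - perfect overlap
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.14 - would be flagged as Low IoU
```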
As you can see, with AI CS, you can find plenty of labels that were missed in the original dataset - and you do not have to check every image manually. Moreover, the convenient AI CS UI lets you work with many suggestions on one page, so you can swiftly skim through them and correct errors where needed. In about 3 minutes, we added over 120 new annotations. That is very fast for an Object Detection task.
As for the suggestions with the other error types, we also checked them. With Low IoU errors, AI CS suggested enlarging bounding boxes for some objects. We did not do it as most suggestions were related to the train part of the dataset, and we did not want to mess up the original annotations.
Additionally, AI CS found many Extra label errors that were not valuable for us as the annotations seemed fine. However, it is understandable why AI CS provided such suggestions. All the images with at least one Extra label error were of extremely low resolution, which explains AI models’ messy predictions.
To summarize, AI Consensus Scoring served its purpose, highlighted potential errors in the initial annotations, and found the missing labels - what else could you ask for?
At this point, we were sure that the annotations were correct, and we had as many images annotated as possible (within a single dataset). It was time to train a custom Object Detection model through Hasty’s [Model Playground](https://help.hasty.ai/model-playground/data-split/creating-a-new-split). We took a model with basic parameters and used Hasty’s no-code solution to train it. The model had 4842 images for training, 1383 images for the test set, and 691 for validation.
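The split above is roughly 70/20/10. As a hypothetical illustration of how such a split can be produced (Model Playground does this for you), a seeded random shuffle keeps the split reproducible:

```python
# Hypothetical sketch of a reproducible ~70/20/10 train/test/validation split.
# The item count below matches the split sizes quoted above.
import random

def split(items, fractions=(0.7, 0.2, 0.1), seed=42):
    """Shuffle deterministically, then cut into train/test/validation parts."""
    items = items[:]
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(n * fractions[0])
    n_test = int(n * fractions[1])
    return items[:n_train], items[n_train:n_train + n_test], items[n_train + n_test:]

train, test, val = split(list(range(6916)))
print(len(train), len(test), len(val))  # 4841 1383 692
```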
It took the model an hour and seven minutes to train. When it was ready, we used the experiment’s [deploy](https://help.hasty.ai/model-playground/model-exports/deploy) feature to select the trained model as the corresponding AI assistant in the labeling tool. Thus, we updated our Object Detection assistant and returned to the annotation environment.
Testing the updated assistant
The updated assistant worked better than the original one, producing perfect labels at 30 confidence on many difficult pictures with overlapping objects of different classes. Not only did it detect all the objects correctly, but it also did not predict any strange bounding boxes with a confidence higher than 30.
The updated assistant also helped to unveil new edge cases. For example, the model detected a bald person as wearing a hard hat with high confidence.
Needless to say, the updated assistant still made some mistakes and produced weird (and funny) artifacts, as the data we trained on did not directly address the edge cases. However, that is not something we can fix using only one dataset. To further improve the solution, we should add more data covering those edge cases.
From the user’s perspective, the general performance of the updated assistant was good as its suggestions were almost perfect. To test the assistant further, we added random images from the Web that seemed similar to those in the initial Hard Hat dataset.
As you can see, the model did an excellent job on unseen data. Sure, it did not work perfectly - very few models do when you feed in data from a different source - but the result is very promising and indicates a model that generalizes well. With more data, a bit more QA, and a model tuned beyond the standard defaults, you could get a production-ready model in weeks.
Overall, using Hasty features on the Hard Hat Worker task was successful. Sure, there are edge cases assistants make mistakes on, but it is crucial to consider the number of images we used (a single dataset) and the time we spent on the task.
The fact that we could develop a working solution that performs well on data from other sources in such a short time (about 12 hours from the first image imported to the last assistants’ test) showcases, we think, how fast you can build AI today if you use the right tools for the job.
From an annotation automation perspective, we took a default model that could already automate about 70-80% of the annotation work (thanks to the large initial dataset) and got it up to about 90% in a couple of hours. This level of automation can be a massive time-saver, with annotators able to annotate a complete image at the press of a button.
We also suspect that a trained ML engineer or Data Scientist could push the model we built further and get considerably better results than ours.
Thanks for reading, and happy training!
Shameless plug time
Only 13% of vision AI projects make it to production. With Hasty, we boost that number to 100%.
Our comprehensive vision AI platform is the only one you need to go from raw data to a production-ready model. We can help you with:
- Labeling 10x faster with our AI Assistants.
- Automating quality control, making it 35x faster, with our AI Consensus Scoring feature.
- Training models in our no-code Model Playground, which can then be used to improve labeling and QA automation even further.
- All while keeping you in control and your data safe.
All the data and models you create always belong to you and can be exported and used outside of Hasty at any given time entirely for free.
You can try Hasty by signing up for free here. If you are looking for additional services like help with ML engineering, we also offer that. Check out our service offerings here to learn more about how we can help.