How to use Active Learning in Hasty

Active Learning is an iterative algorithm that detects the most informative images in the project and suggests them for labeling first. The more uncertain the model is about an image's labels, the more informative the image is considered and the higher the rank it receives. Labeling the high-ranking images first might notably increase the model’s performance and lead to faster model development.

To get started with Active Learning, make sure you have created a Hasty project and uploaded the images you want to annotate.

You can use the Active Learning feature in two possible ways: Auto-generated ranks and Ranks on demand. We will talk about both cases in detail below.

Useful glossary for this page:

Ranks - ordinal positions assigned to the images in the dataset based on their informativeness score. Images where the model is most certain about its predictions get the lowest ranks, whereas images where the model is most uncertain get the highest ranks.

Heuristics - in this context: functions used to rank the images by their informativeness. In a broader context: techniques that help address the problem quickly and efficiently when classical approaches are too slow or bulky. Although heuristics trade off accuracy for speed, the results achieved with them are usually good enough and can be used instead of searching for the perfect solution.
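The relationship between informativeness scores and ranks can be illustrated with a minimal Python sketch. It is not Hasty's implementation: the function name, the example scores, and the convention that rank 1 denotes the most informative image are assumptions made for illustration.

```python
def assign_ranks(scores):
    """Map each image name to a rank; rank 1 is assumed to be the most informative."""
    # Sort by descending score and break ties by name so the order is deterministic.
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    return {name: position for position, name in enumerate(ordered, start=1)}


# Hypothetical informativeness scores produced by some heuristic.
scores = {"cat.jpg": 0.10, "dog.jpg": 0.85, "bird.jpg": 0.40}
ranks = assign_ranks(scores)

# Print the images in the order they would be suggested for labeling.
for name in sorted(ranks, key=ranks.get):
    print(ranks[name], name)
```

Here the heuristic only has to produce one number per image; the ranking step is the same whichever heuristic is chosen.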

Auto-generated ranks are a default Hasty feature that automatically suggests the best images to label. You do not need to enable it: Auto-generated ranks are available as soon as you create the project.

The feature is linked to the following AI-assistants:

This means that, as you start annotating the images and hence training the corresponding AI-assistants, the feature will automatically generate image ranks for labeling.

The ranks are updated every time the assistant is trained. Therefore, the suggestions made by the Auto-generated ranks feature will improve as you label more images and train the AI-assistants. When the image ranks change, you will receive a corresponding notification.

The default heuristic for each model family is Margin. However, you can easily change it and select a different heuristic for each model family. Currently, the heuristics provided by Hasty include:

You can also select the “None” option to not use any heuristics for specific model families. In this case, you will not see image ranks for the respective model family unless you generate Ranks on demand. Instead, you will see the default alphabetical sorting of the images in your project.
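To give an intuition for the default Margin heuristic, here is a minimal sketch of margin sampling as it is commonly defined in the Active Learning literature: the smaller the gap between the two most likely classes, the more uncertain the prediction. How Hasty aggregates predictions per image is not described on this page, so the function name, the single probability vector per image, and the "1 minus margin" scaling are all assumptions.

```python
def margin_score(probabilities):
    """Return an informativeness score from the gap between the top two classes.

    A small gap means the model can hardly choose between two classes,
    so the score is high. This scaling is an illustrative assumption.
    """
    if len(probabilities) < 2:
        raise ValueError("need at least two class probabilities")
    top, runner_up = sorted(probabilities, reverse=True)[:2]
    return 1.0 - (top - runner_up)


# Hypothetical class probabilities for two images.
confident = [0.90, 0.07, 0.03]   # clear winner, large margin
uncertain = [0.40, 0.35, 0.25]   # two close candidates, small margin

print(margin_score(confident) < margin_score(uncertain))
```

Under this sketch, the second image would be suggested for labeling before the first one.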

To tweak Auto-generated ranks parameters, please go to the Active Learning settings in the lower left corner.

The alternative to Auto-generated ranks is the Ranks on demand feature. It allows you to generate a set number of image ranks whenever you need them rather than automatically. For example, you could use this feature to rank 100 images from your project for labeling.

This feature allows you to customize the Active Learning process by giving you control over the number of images you want to rank and the sources of these images (selected datasets).

To access it, you should go to Active Learning settings in the lower left corner and manually select it.

Press the New rank button to generate a rank on demand. For each run, you will need to specify:

  • Its name;
  • The model family;
  • The heuristic you want to use;
  • The number of images to be ranked;
  • The datasets from which you want to take images for ranking.
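The five parameters listed above can be pictured as a small record. The sketch below is purely illustrative: a run is created through the New rank button in the interface, and the class name, field names, and example values here are hypothetical rather than part of any Hasty API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankRunRequest:
    """Hypothetical container for the settings of one Ranks on demand run."""
    name: str
    model_family: str
    heuristic: str
    image_count: int
    datasets: tuple

    def __post_init__(self):
        # Basic sanity checks; the real interface may validate differently.
        if not self.name.strip():
            raise ValueError("run name must not be empty")
        if self.image_count <= 0:
            raise ValueError("number of images to rank must be positive")
        if not self.datasets:
            raise ValueError("select at least one dataset")


# Placeholder values; only "Margin" is a heuristic name taken from this page.
request = RankRunRequest(
    name="first-100-images",
    model_family="example-model-family",
    heuristic="Margin",
    image_count=100,
    datasets=("example-dataset",),
)
print(request)
```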

After you create a run, you can view it in the Active Learning tab of the Project Dashboard. You will receive a notification when the run is completed.

Note that only one run can be processed at a time. To start a new run, please wait until the previous run is over.

What is the difference between the Auto-generated ranks and Ranks on demand?

  • The Auto-generated ranks are produced each time a model trains or retrains. As your AI-assistants update, your image ranking improves as well.
  • The Ranks on demand, or user-generated ranks, are produced only if you create a run and specify how many images you want to rank.

When might Ranks on demand be preferable to use?

In general, neither option is inherently better than the other. The Auto-generated ranks are convenient since they do not require much supervision and are updated automatically when your AI-assistant updates.

However, you might opt for the Ranks on demand when:

  • You want to receive rank suggestions as soon as possible without waiting for the next AI-assistant retrain. This might be useful if you have to do a lot of labeling before the next retrain;
  • You want to try out another heuristic and see how it performs on a certain number of images;
  • You are deploying a Model Playground model.

Apart from using Active Learning heuristics directly, you can go for other options:

  • Random sorting algorithm suggests the images for labeling in a random fashion. This is also a valid way to increase the efficiency of your labeling. To learn more about how Random sampling works, check out our Active Learning documentation page.
  • No rank (default sorting) does not assign any ranks to the images. It simply sorts the data in the default alphabetical order.

To switch to one of these options, go to the Active Learning settings in the lower left corner.
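The two non-heuristic options can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Hasty's code: the function names, the optional seed, and the example file names are hypothetical.

```python
import random


def random_suggestions(image_names, count, seed=None):
    """Pick up to `count` distinct images uniformly at random."""
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    count = min(count, len(image_names))
    return rng.sample(sorted(image_names), count)


def default_order(image_names):
    """No rank: fall back to alphabetical sorting of the image names."""
    return sorted(image_names)


images = ["street_03.jpg", "street_01.jpg", "park_02.jpg", "park_01.jpg"]
print(random_suggestions(images, 2, seed=7))
print(default_order(images))
```

Random sampling ignores the model's predictions entirely, which makes it a useful baseline when comparing heuristics.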

As for the next steps, you have various options. For example, you can use Automated labeling to annotate the suggested images automatically and then run AI Consensus Scoring to review the existing labels.

To learn more about the general Active Learning concept, please check out our article

If you have questions or suggestions, please don't hesitate to contact us. Happy training!

Last updated on Dec 16, 2022
