Classification / Tagging

Computer Vision (CV) is a scientific field that studies software systems trained to extract information from visual data, analyze it, and draw conclusions based on that analysis. The field consists of so-called CV or vision AI tasks. Each task is unique and incorporates its own techniques and heuristics for acquiring, processing, analyzing, and understanding data, and for extracting various details from it. The most basic and well-known task is Classification. On this page, we will:

  • Understand the basics of the Classification / Tagging field in Machine Learning;

  • Cover in-depth the Classification / Tagging vision AI task;

  • Compare Binary, Multi-class, and Multi-label Classification;

  • Research the real-life applications of Classification / Tagging;

  • Cover some popular Classification / Tagging datasets and SOTA results on them;

  • See features that Hasty offers for streamlining a Classification / Tagging task.

Let’s jump in.

In everyday life, Classification refers to assigning a class label to a specific object. In a sense, it is the most common task every person performs daily. For example, when you see a dog on a walk, you classify it as belonging to a specific breed. The same goes for any object you see around you. So, Classification is one of the basic tasks our brain solves.

Fortunately, when it comes to Machine and Deep Learning (ML and DL), the Classification definition is not that different from the one we have just discussed. In the Artificial Intelligence field, Classification is a Supervised Learning task that focuses on assigning a class (label) to a given data asset.

Supervised Learning is a learning approach in which an ML algorithm is trained on labeled data annotated for a specific output.

In ML and DL, you can classify many objects, the most popular being:

  • Images;

  • Videos;

  • Texts;

  • Documents;

  • Tabular data;

  • etc.

In any case, the general Classification algorithm in ML is as follows:

  1. You take some prelabeled data as training input for your model;

  2. You get the probability vector as a prediction output;

  3. You analyze the obtained vector based on standard heuristics or your own logic and formulate the final prediction.

The probability vector (for example, [0.2, 0.3, 0.5]) features values that can be interpreted as the probability that an image corresponds to a specific category. In other words, these values represent the model’s confidence that the given image belongs to a particular class.

In our case, the highest probability (0.5) is in the third slot, which means that the model thinks an object corresponds to class 3 with a high chance.
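The three steps above can be sketched in a few lines. This is a minimal illustration, not a full model: the raw outputs (logits) are made up, and a softmax turns them into the probability vector that we then analyze with the standard argmax heuristic.

```python
import numpy as np

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability
    z = np.exp(logits - np.max(logits))
    return z / z.sum()

# Hypothetical raw model outputs for three classes
logits = np.array([0.2, 0.6, 1.1])
probs = softmax(logits)           # a probability vector summing to 1
pred = int(np.argmax(probs)) + 1  # pick the highest slot (1-based class number)
```

Here the third slot gets the highest probability, so the model's prediction is class 3, matching the interpretation described above.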

Still, it is essential to draw a clear-cut distinction between Classification in Machine Learning and in Deep Learning. Fortunately, there is no difference in how Classification is interpreted in these fields, except for the data used for analysis and the methods used to address the challenge. In conventional Machine Learning, Classification is usually performed on tabular data using standard ML algorithms such as Logistic Regression, Decision Tree, Random Forest, etc. In Deep Learning, on the other hand, researchers use neural networks to analyze more complex data assets, for example, images.

With Hasty being a vision AI platform that works with visual data, we see Classification as a Deep Learning task.

Additionally, in Deep Learning, Classification might be a part of more challenging tasks such as Instance Segmentation, Semantic Segmentation, Object Detection, etc. However, sometimes researchers solve Classification as a standalone task.

Classification Vs. Tagging

In some academic papers, you might come across the term Tagging. It is usually used in the same context as Classification, so many people assume the two are synonyms that can be used interchangeably. This is wrong. Let’s put everything in its place.

The Classification task can be divided into two major parts:

  • Single-label Classification - you assign a single class to your object (for example, an image);

    • Binary Classification;

    • Multi-class Classification;

  • Multi-label Classification - you assign multiple classes to your object.

Multi-label Classification is often referred to as Image Tagging or Attribute Prediction. This is where the confusion comes from. So, in a very particular case, when you speak about Multi-label Classification, you can use the Classification and Tagging terms interchangeably.

Let’s take a closer look at each variation of the Classification task.

Binary Classification is a Classification task that operates with only two classes. Usually, these class labels are mapped to the values 0 and 1. For instance, the class True corresponds to 1 and False to 0. So the ultimate goal is to predict one of the two classes.

The general Binary Classification algorithm in all ML is as follows:

  1. You take some prelabeled data as input for your model;

  2. You get the probability vector as an output (for example, [0.3, 0.7]);

  3. You analyze the obtained vector based on standard heuristics or your own logic and formulate the final prediction. In our case, the highest probability (0.7) is in the second slot, which means that the model thinks an object corresponds to class 2 with a high chance.

Binary Classification example

In a real-life scenario, we regularly come across Binary Classification problems. Some good examples of these might be:

  • Is the patient healthy or sick?

  • Is the received email spam or not?

  • Is the answer to the question yes or no?

  • Should I choose this option or another one?

As for the algorithms you can use to solve a Binary Classification task, it depends on whether you aim to solve a Machine or Deep Learning Classification problem:

  • Machine Learning Binary Classification algorithms;

    • Logistic Regression;

    • Support Vector Machine (SVM);

    • k-Nearest Neighbors;

    • Decision Tree;

    • Random Forest;

    • etc.

  • Deep Learning Binary Classification algorithms - basically, any modern image classification network architecture;

    • ResNet;

    • MobileNet (all versions);

    • EfficientNet;

    • SWIN;

    • ConvNeXt;

    • ResNeXt;

    • etc.
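As a minimal sketch of the Machine Learning route, here is Logistic Regression from scikit-learn fitted on a toy, made-up dataset (a single feature, label 1 when the value is high); the data and the query point are illustrative assumptions, not anything from a real task.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy prelabeled data: one feature per sample, class 1 when the value is high
X = np.array([[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X, y)
probs = model.predict_proba([[0.85]])[0]  # probability vector: [P(class 0), P(class 1)]
pred = int(np.argmax(probs))              # highest-probability class
```

For a query value of 0.85, which sits firmly in the class-1 region of this toy data, the model predicts class 1.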

Multi-class Classification is a Classification task that operates with more than two classes. The number of labels can vary from three to very large quantities. For instance, one can build a model that classifies letters of the alphabet or one of the thousands of items in a grocery store.

The key difference between Binary and Multi-class Classification is the number of classes. The Binary Classification models operate with two classes only. In contrast, there can be more than two classes in Multi-class Classification tasks.

The general Multi-class Classification algorithm in all ML is as follows:

  1. You take some prelabeled data as input for your model;

  2. You get the probability vector as an output (for example, [0.1, 0.2, 0.7]);

  3. You analyze the obtained vector based on standard heuristics or your own logic and formulate the final prediction. In our case, the highest probability (0.7) is in the third slot, which means that the model thinks an object corresponds to class 3 with a high chance.

Binary Vs. Multi-class Classification

Like in the Binary case, we regularly encounter Multi-class Classification problems in real-life scenarios. Some good examples of these might be:

  • Classifying animals by their species;

  • Classifying clients by their behavior;

  • Sentiment analysis of a sentence (sad, happy, neutral, etc.).

As for the algorithms you can use to solve a Multi-class Classification task, it depends on whether you aim to solve a Machine or Deep Learning Classification problem:

  • Machine Learning Multi-class Classification algorithms;
    • Logistic Regression;
    • Support Vector Machine (SVM);
    • k-Nearest Neighbors;
    • Decision Tree;
    • Random Forest;
    • etc.
  • Deep Learning Multi-class Classification algorithms - basically, any modern image classification network architecture;
    • ResNet;
    • MobileNet (all versions);
    • EfficientNet;
    • SWIN;
    • ConvNeXt;
    • ResNeXt;
    • etc.
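A minimal sketch with one of the listed ML algorithms: a Random Forest trained on made-up one-feature data covering three classes. The data, query value, and `random_state` are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy prelabeled data: three well-separated value ranges, one per class
X = np.array([[0.0], [0.1], [0.5], [0.6], [1.0], [1.1]])
y = np.array([0, 0, 1, 1, 2, 2])

model = RandomForestClassifier(random_state=0).fit(X, y)
probs = model.predict_proba([[1.05]])[0]  # probability vector over the 3 classes
pred = int(np.argmax(probs))              # highest-probability class
```

A query value of 1.05 falls in the range covered by class 2, so the forest assigns it the highest probability.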

Multi-label Classification is a Classification task that can operate with more than two classes and allows you to assign more than one label to an object (for example, an image).

The general Multi-label Classification algorithm in all ML is as follows:

  1. You take some prelabeled data as input for your model;

  2. You get the probability vector as an output (for example, [0.1, 0.35, 0.7]);

  3. You analyze the obtained vector based on standard heuristics or your own logic and formulate the final prediction. For example, you can set a certain threshold and use it to decide whether to assign a class to an image based on the predicted probability. Let’s say that, in our case, the threshold is 0.3. As you can see, two values are above the threshold, so we can assign classes 2 and 3 to the image.
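Step 3, with a threshold of 0.3 applied to the example probability vector, can be sketched as follows:

```python
import numpy as np

probs = np.array([0.1, 0.35, 0.7])  # per-class probabilities from the model
threshold = 0.3
# Keep every class whose probability clears the threshold (1-based class numbers)
assigned = np.flatnonzero(probs >= threshold) + 1
```

Classes 2 and 3 clear the threshold, while class 1 does not, so the image gets two labels.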

Multi-label classification example

Multi-label Classification is widespread in cases where an object can be assigned to many classes simultaneously. A good example is classifying movies by genre.

As for the algorithms you can use to solve a Multi-label Classification task, it depends on whether you aim to solve a Machine or Deep Learning Classification problem:

  • Machine Learning Multi-label Classification algorithms - many standard ML algorithms support Multi-label cases;
    • k-Nearest Neighbors;
    • Decision Tree;
    • Random Forest;
    • Ridge;
    • etc.
  • Deep Learning Multi-label Classification algorithms - basically, any modern image classification network architecture with some logic built upon its output;
    • ResNet;
    • MobileNet (all versions);
    • EfficientNet;
    • SWIN;
    • ConvNeXt;
    • ResNeXt;
    • etc.
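To illustrate the "logic built upon the output" point: multi-label networks typically apply an independent sigmoid to each class score instead of a softmax, so the probabilities do not have to sum to 1. A sketch with hypothetical per-class scores:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical raw per-class scores from a network's final layer
scores = np.array([-2.0, 0.5, 1.5])
probs = sigmoid(scores)                  # independent probabilities, need not sum to 1
assigned = np.flatnonzero(probs >= 0.5)  # keep every class above the 0.5 threshold
```

With these scores, the first class falls below 0.5 and is dropped, while the other two are both assigned.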

Binary Classification

  • Idea: to predict whether the input falls or does not fall into a certain category. Operates with two classes only.

  • Input: tabular data / text data / an image / etc. Examples: a table with school grades; an email; a photo of a cat.

  • Output: a probability vector. The goal is to pick one of two labels. Examples: accepted/rejected; spam/not spam; cat/dog, and so on.

Multi-class Classification

  • Idea: to predict the most probable class of the input out of many.

  • Input: tabular data / text data / an image / etc. Examples: an animal; an item from a grocery store; a table with age and income data.

  • Output: a probability vector. The goal is to pick one class out of multiple labels. Examples: cat/dog/unicorn; apple/banana/corn; single/divorced/married.

Multi-label Classification

  • Idea: to predict all the classes the input might be assigned to. Can operate with many classes (more than two).

  • Input: tabular data / text data / an image / etc. Example: a picture with several objects in it.

  • Output: a probability vector. The goal is to pick one or several classes out of multiple labels. Example: both a dog and a plant found in the picture.

Classification is one of the most basic tasks in Machine and Deep Learning, widely used both as a standalone challenge and as part of more complex tasks. Some popular applications of a self-contained Classification task include:

  • Agricultural challenges (for example, classifying crops as damaged or healthy);
  • Handwritten digit recognition (from 0 to 9);
  • Categorizing emotions on a human’s face;
  • Identifying whether the person is a child or an adult;
  • Classifying the patient’s state of health by an image (medical image processing);
  • And many more use cases.

You can find many free datasets that can be used to solve all sorts of Deep Learning Classification tasks on the Web. The most popular ones are:

  • ImageNet - a large visual database with more than 14 million images;
  • CIFAR-10 - a dataset comprising 60,000 32x32 color images in 10 classes, with 6,000 images per class;
  • CIFAR-100 - a dataset of 60,000 32x32 color images in 100 classes, with 600 images per class;
  • MNIST (Modified National Institute of Standards and Technology) database - an extensive collection of handwritten digits.

As you might know, data annotation can be a bottleneck for AI startups, as the conventional labeling approach is both costly and time-consuming. Hasty’s data-centric ML platform addresses this pain point and automates 90% of the work needed to build and optimize your dataset for the most advanced use cases with our self-learning assistants, using AI to train AI.

The primary focus of Hasty is the vision AI field. Therefore, Hasty is a perfect Classification annotation tool, as it provides all the necessary instruments to help you with your Classification task.

Let’s go through the available options step-by-step. To streamline your Classification annotation experience, Hasty offers:

As for the annotation quality control process, Hasty has you covered with its AI Consensus Scoring feature that has a separate Class review option. With the help of AI CS, you can find misclassified labels. Also, you will better understand how a machine sees your data, which might be valuable for your annotation strategy.

When it comes to model building, Hasty’s Model Playground supports many modern neural network architectures. For Classification, these include:

As a Machine Learning metric for the Classification / Tagging case, Hasty uses:

As of today, these are the key options Hasty has for the Classification / Tagging cases. If you want a more detailed overview, please check out the further resources or book a demo to get deeper into Hasty with our help.

Last updated on Dec 19, 2022
