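Variance measures the average squared deviation of the predicted class probabilities from their mean. The instance with the lowest Variance is considered the most informative, because a low Variance means the probabilities are close to each other and the model cannot separate the classes. Labeling this instance is therefore likely to be very useful to our model.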

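  • If the Variance is 0, all classes have the same predicted probability, so the model doesn’t have a clue about the correct label.
  • The higher the Variance, the clearer the model’s “belief” about the correct label. For class probabilities it peaks when one class gets probability 1, and that peak is always below 1 (2/9 ≈ 0.22 for three classes).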

Imagine we have only two instances, A and B, and the model has to decide which of them to suggest for annotation. It has made the following class predictions:

  • Instance A: “cat” – 0.5, “milkshake” – 0.45, “cloud” – 0.05.
  • Instance B: “cat” – 0.4, “milkshake” – 0.3, “cloud” – 0.3.

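In this case, our model will choose instance B over A, because B's Variance (about 0.0022) is lower than A's (about 0.0406). Both values are the mean squared deviation of the three probabilities from their average of 1/3.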
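The comparison can be reproduced with a few lines of Python. This is a minimal sketch, assuming the population variance (the sum of squared deviations divided by the number of classes). The function names `prediction_variance` and `pick_for_annotation` are hypothetical helpers introduced here for illustration, not part of any particular library.

```python
from statistics import pvariance


def prediction_variance(probs):
    """Population variance of one instance's predicted class probabilities."""
    return pvariance(probs)


def pick_for_annotation(predictions):
    """Return the name of the instance whose predictions have the lowest variance."""
    return min(predictions, key=lambda name: prediction_variance(predictions[name]))


# Class probabilities for "cat", "milkshake", "cloud" from the example above.
predictions = {
    "A": [0.5, 0.45, 0.05],
    "B": [0.4, 0.3, 0.3],
}

for name, probs in predictions.items():
    print(name, round(prediction_variance(probs), 4))

print("Suggest for annotation:", pick_for_annotation(predictions))
```

Under this definition, A comes out at roughly 0.0406 and B at roughly 0.0022, so B is the instance suggested for annotation. Using the sample variance instead (dividing by the number of classes minus one) would change both values but not their order.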
