If you have ever wondered how to evaluate an Object Detection algorithm, you have probably heard of the well-known mean Average Precision (mAP) Machine Learning (ML) metric. As the name suggests, mAP is calculated by taking the mean of Average Precision scores. So, to understand mAP, you must first understand the Average Precision concept. On this page, we will:
Cover the logic behind the Average Precision metric;
Find out how to interpret the metric’s value;
Calculate Average Precision on a simple example;
And see how to work with Average Precision using Python.
Let’s jump in.
Like the Area under the Precision-Recall curve (AUC-PR) metric, Average Precision is a way to summarize the PR curve into a single value. To define the term, the Average Precision metric (or just AP) is the weighted mean of Precision scores achieved at each PR curve threshold, with the increase in Recall from the previous threshold used as the weight.
Sure, such a definition might be tough to process at first. Still, everything becomes clearer as soon as you look at the formula.
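Here it is (this is also the definition Scikit-learn uses for its average_precision_score implementation):

AP = Σ (R_n − R_(n−1)) × P_n

where P_n and R_n are the Precision and Recall at the n-th threshold, and (R_n − R_(n−1)) is the weight.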
So, the general Average Precision calculation algorithm is as follows (a short Python sketch of this loop follows the list):
Get the predictions from your model, define the thresholds, and build a Precision-Recall curve (in the multiclass case, you can compute Micro or Macro Precision/Recall, for example);
Use a loop that goes through all Precision/Recall pairs;
Calculate the difference between the current and next Recall values (weight);
Multiply the weight by the current Precision value;
Repeat the previous two steps for each remaining pair (the last pair has no next Recall value, so it is skipped);
Sum up the obtained scores;
Analyze the Average Precision value.
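To make the loop concrete, here is a minimal Python sketch of the steps above (the precisions and recalls values are made up for illustration and are assumed to be sorted from the highest Recall to the lowest):
# A minimal sketch of the step-by-step Average Precision calculation
precisions = [0.8, 0.9, 1.0]  # hypothetical Precision values
recalls = [1.0, 0.5, 0.0]     # hypothetical Recall values, from highest to lowest
average_precision = 0.0
for i in range(len(recalls) - 1):
    weight = recalls[i] - recalls[i + 1]          # difference between the current and next Recall
    average_precision += precisions[i] * weight   # weight multiplied by the current Precision
print(round(average_precision, 2))  # 0.8 * 0.5 + 0.9 * 0.5 = 0.85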
Additionally, in real life, if you face a multiclass case, you might want to calculate the Average Precision score for each class separately. Such an approach gives you a better view of the algorithm’s performance, as you build a PR curve for each category and can see whether your model is good at detecting objects of a specific class.
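For instance, here is a rough sketch of how per-class Average Precision might be computed with Scikit-learn by binarizing the labels in a one-vs-rest fashion (the labels and scores below are made up for illustration):
# Per-class Average Precision sketch for a multiclass case (one-vs-rest)
import numpy as np
from sklearn.preprocessing import label_binarize
from sklearn.metrics import average_precision_score
classes = [0, 1, 2]
y_true = np.array([0, 1, 2, 2, 1, 0])   # hypothetical ground-truth labels
y_score = np.array([[0.7, 0.2, 0.1],    # hypothetical predicted scores per class
                    [0.3, 0.5, 0.2],
                    [0.1, 0.3, 0.6],
                    [0.2, 0.3, 0.5],
                    [0.4, 0.4, 0.2],
                    [0.6, 0.3, 0.1]])
y_true_bin = label_binarize(y_true, classes=classes)  # one binary column per class
for k, c in enumerate(classes):
    ap = average_precision_score(y_true_bin[:, k], y_score[:, k])
    print(f"Average Precision for class {c}: {ap:.2f}")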
It is easy to understand the AP value itself. The more correct predictions your model makes, the better the PR curve and, as a result, the higher the Average Precision. The higher the metric value, the better. The best possible score is 1, and the worst is 0.
However, it is difficult to set any universal benchmarks for Average Precision because acceptable values vary greatly depending on the Machine Learning task, the type of the case (binary/multiclass), etc. So, we suggest you dive deeper into your task and develop your own benchmarking logic if you want to use Average Precision as an evaluation metric.
Let’s check out how to calculate Average Precision using a simple example. Imagine we have the following Precision/Recall pairs.
Precision | Recall
---|---
0.5 | 1
0.7 | 0.6
0.75 | 0.5
0.9 | 0.3
1 | 0
Let’s start by calculating the weights (the differences between consecutive Recall values):
1 - 0.6 = 0.4;
0.6 - 0.5 = 0.1;
0.5 - 0.3 = 0.2;
0.3 - 0 = 0.3.
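If you want Python to do the subtraction for you, a quick check might look like this (the recalls list simply repeats the Recall column of the table above):
# Quick check of the Recall weights from the example
recalls = [1, 0.6, 0.5, 0.3, 0]
weights = [round(recalls[i] - recalls[i + 1], 2) for i in range(len(recalls) - 1)]
print(weights)  # [0.4, 0.1, 0.2, 0.3]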
Precision | Recall | Recall weight
---|---|---
0.5 | 1 | 0.4
0.7 | 0.6 | 0.1
0.75 | 0.5 | 0.2
0.9 | 0.3 | 0.3
1 | 0 | -
Now it is time to multiply the weights by the corresponding Precision values:
0.5 * 0.4 = 0.2;
0.7 * 0.1 = 0.07;
0.75 * 0.2 = 0.15;
0.9 * 0.3 = 0.27.
The final step is to sum up the obtained values:
Average Precision = 0.2 + 0.07 + 0.15 + 0.27 = 0.69
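The same result can be reproduced in a couple of lines of Python (the lists simply repeat the values from the tables above):
# Reproducing the worked example
precisions = [0.5, 0.7, 0.75, 0.9, 1]
recalls = [1, 0.6, 0.5, 0.3, 0]
average_precision = sum(precisions[i] * (recalls[i] - recalls[i + 1]) for i in range(len(recalls) - 1))
print(round(average_precision, 2))  # 0.69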
Average Precision as a standalone Machine Learning metric is not that popular in the industry. In real life, it is mostly used as the basis for the slightly more complicated mean Average Precision metric. On this page, we present one code block showing how to work with Average Precision in Python through the Scikit-learn (sklearn) library.
Hello, and thank you for using the code provided by Hasty. Please note that some code blocks might not be 100% complete and ready to be run as is. This is done intentionally, as we focus on implementing only the most challenging parts that might be tough to pick up from scratch. View our code block as a LEGO block - you can’t use it as a standalone solution, but you can take it and add it to your system to complement it. If you have questions about using the tool, please get in touch with us to get direct help from the Hasty team.
# Importing the libraries and functions
import numpy as np
from sklearn.metrics import average_precision_score
# Defining the arrays
y_true = np.array([0, 0, 1, 1])  # ground-truth labels
y_scores = np.array([0.1, 0.4, 0.35, 0.8])  # predicted probabilities of an object being of class 1
# Calculating the result
print(average_precision_score(y_true, y_scores))  # ≈ 0.83