ROC curve and PR curve

Table of Contents

1. ROC curve

1. Concept

2. Drawing of ROC curve

2. PR curve

1. Concept

2. AP (area under the PR curve)

3. Drawing of PR curve

3. Advantages and disadvantages of ROC curve and PR curve

1. ROC curve

2. PR curve

4. Experiment summary

1. ROC curve

1. Concept

The ROC (Receiver Operating Characteristic) curve is an important tool for evaluating binary classification models. It can help us choose the best classification threshold to achieve the best classification results in different situations. The horizontal axis of the ROC curve is the false positive rate (FPR), and the vertical axis is the true positive rate (TPR), where:

True positive rate (TPR) = number of positive examples predicted as positive / number of actual positive examples, i.e. TPR=\frac{TP}{TP + FN}

False positive rate (FPR) = number of negative examples predicted as positive / number of actual negative examples, i.e. FPR=\frac{FP}{FP + TN}
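As a quick check of the two formulas, here is a minimal Python sketch with hypothetical confusion-matrix counts:

# Hypothetical confusion-matrix counts, for illustration only
TP, FN = 40, 10    # 50 actual positive examples
FP, TN = 5, 45     # 50 actual negative examples

tpr = TP / (TP + FN)   # true positive rate
fpr = FP / (FP + TN)   # false positive rate
print(tpr, fpr)        # 0.8 0.1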

In the ROC curve, each point represents the FPR and TPR of the model under a particular classification threshold. The closer a point on the curve is to the upper left corner, the better the model performs at that threshold. If the ROC curve lies entirely on the diagonal, the model cannot distinguish positive from negative examples any better than random guessing.

A single value called AUC (Area Under the Curve) can be calculated from the ROC curve: it is the area under the curve. The closer the curve is to the upper left corner (TPR well above FPR), the better the overall performance of the model and the larger the AUC. The AUC value measures the classification ability of the model: the larger the better, with 0.5 corresponding to the diagonal (random guessing) and 1 to a perfect classifier.

2. Drawing of ROC curve

Drawing the ROC curve requires calculating the true positive rate (TPR) and false positive rate (FPR) under different classification thresholds. The following is a commonly used method of drawing ROC curves:

  1. Calculate the TPR and FPR of the model under different classification thresholds. First convert the predicted scores into binary prediction labels according to each classification threshold, then compute the TPR and FPR at that threshold.

  2. Sort the resulting (FPR, TPR) pairs in ascending order of FPR.

  3. Draw the ROC curve: take the sorted TPR values as the vertical axis and the corresponding FPR values as the horizontal axis, and connect the points (a minimal sketch of these steps is given below, before the scikit-learn example).
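Before turning to scikit-learn, the three steps above can be carried out by hand. The following is a minimal NumPy sketch on assumed toy labels and scores, using every distinct score as a classification threshold:

import numpy as np

# Assumed toy labels and predicted probabilities, for illustration only
y_true = np.array([0, 0, 1, 0, 1])
y_scores = np.array([0.1, 0.3, 0.4, 0.6, 0.8])

points = []
for t in np.sort(np.unique(y_scores))[::-1]:          # step 1: sweep thresholds from high to low
    y_pred = (y_scores >= t).astype(int)              # predict positive if score >= threshold
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    points.append((fp / (fp + tn), tp / (tp + fn)))   # one (FPR, TPR) pair per threshold

points.sort()                                          # step 2: order the pairs by FPR
for fpr, tpr in points:                                # step 3: these are the points to connect
    print(fpr, tpr)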

Here is a simple example code showing how to plot an ROC curve using the scikit-learn library in Python:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

# Assume there are probability values predicted by the model and actual labels
y_true = np.array([0, 0, 1, 0, 1])
y_scores = np.array([0.1, 0.3, 0.4, 0.6, 0.8])

# Calculate points on the ROC curve
fpr, tpr, thresholds = roc_curve(y_true, y_scores)

# Calculate AUC value
roc_auc = auc(fpr, tpr)

# Draw ROC curve
plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], 'k--') # Diagonal dashed line
plt.xlim([0.0, 1.0])   # range of the horizontal axis
plt.ylim([0.0, 1.05])  # range of the vertical axis
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC)')
plt.legend(loc="lower right")
plt.show()

(1) First import the required libraries. In this example, y_true holds the actual labels and y_scores holds the probability values predicted by the model.

(2) Use the roc_curve function to calculate points on the ROC curve:

roc_curve(y_true, y_scores): Pass in the actual labels and the predicted probability values; it computes the true positive rate (TPR) and false positive rate (FPR) under different classification thresholds and returns three arrays:

fpr: the false positive rates, in ascending order.

tpr: the true positive rates corresponding to each FPR value.

thresholds: the classification threshold used for each point on the ROC curve, in decreasing order.

(3) Use the auc function to calculate the AUC value:

auc(fpr, tpr): Pass in the FPR and TPR arrays and calculate the area under the ROC curve.
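As a side note, the same value can be obtained in a single call with roc_auc_score, which for binary problems is equivalent to combining roc_curve and auc; reusing y_true and y_scores from the example above:

from sklearn.metrics import roc_auc_score

# Equivalent to auc(fpr, tpr) computed from the roc_curve output
print(roc_auc_score(y_true, y_scores))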

Running the code displays the ROC curve (plot not reproduced here).

In the ROC curve chart, the points closer to (0, 1) correspond to better classification performance of the model.

2. PR Curve

1. Concept

The PR curve (Precision-Recall Curve) is a curve used to evaluate the performance of a classification model. In binary classification problems, we usually focus on two indicators: Precision and Recall.

Precision is defined as the proportion of samples predicted as positive that are actually positive. That is, Precision=\frac{TP}{TP + FP}, where TP represents True Positive and FP represents False Positive.

Recall is defined as the proportion of samples that are actually positive which are predicted as positive. That is, Recall=\frac{TP}{TP + FN}, where FN represents False Negative.
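For a single fixed threshold, these two quantities can be computed directly with scikit-learn's precision_score and recall_score; a minimal sketch on assumed toy data:

import numpy as np
from sklearn.metrics import precision_score, recall_score

# Assumed toy labels and predicted probabilities, for illustration only
y_true = np.array([0, 1, 1, 0, 1])
y_scores = np.array([0.1, 0.3, 0.4, 0.6, 0.8])

y_pred = (y_scores >= 0.5).astype(int)    # one fixed classification threshold
print(precision_score(y_true, y_pred))    # TP / (TP + FP) = 0.5
print(recall_score(y_true, y_pred))       # TP / (TP + FN) = 1/3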

The PR curve is a curve plotted with recall rate on the horizontal axis and precision rate on the vertical axis. Each point on the curve represents the precision and recall of the model under different classification thresholds.

PR curves can be used to visually evaluate the performance of the model under different classification thresholds. The closer the curve is to the upper right corner, the better the model's performance; equivalently, the larger the area under the PR curve, the better the classification performance.

Similar to the ROC curve, the PR curve can help us compare between different models and choose an appropriate threshold to balance the needs of precision and recall.

2. AP (area under the PR curve)

Unlike TPR and FPR, precision and recall are in a trade-off relationship, yet we usually want both to be as high as possible, so the further the PR curve bulges toward the upper right, the better. (There are exceptions: for example, in a risk scenario where predicting 1 for an actual 0 means paying compensation, recall is generally required to be close to 100% and some precision can be sacrificed.) Therefore, except for such special cases, the precision-recall curve is used to find the trade-off between a classifier's precision and recall.

AP is the area under the Precision-recall curve. Generally speaking, a better classifier has a higher AP value.

There is also the concept of mAP (mean Average Precision), which is the mean of the AP values computed for each class. mAP always lies in the interval [0, 1], and the larger the better; it is one of the most important metrics in object detection.
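As a rough sketch of the idea (the full object-detection mAP additionally involves matching predicted boxes to ground truth by IoU), mAP for a multi-class classifier can be approximated by averaging per-class AP in a one-vs-rest fashion; the class scores below are assumptions:

import numpy as np
from sklearn.metrics import average_precision_score

# Assumed scores for a 3-class problem: one column of probabilities per class
y_true = np.array([0, 1, 2, 1, 0])
y_scores = np.array([[0.7, 0.2, 0.1],
                     [0.2, 0.6, 0.2],
                     [0.1, 0.3, 0.6],
                     [0.3, 0.5, 0.2],
                     [0.6, 0.3, 0.1]])

aps = []
for c in range(y_scores.shape[1]):
    # AP for class c, treating it as the positive class (one-vs-rest)
    aps.append(average_precision_score((y_true == c).astype(int), y_scores[:, c]))

print(np.mean(aps))   # mAP: the mean of the per-class AP values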

3. Drawing of PR curve

When a classifier scores a sample, it outputs a confidence, i.e. the probability that the sample is a positive example. By choosing an appropriate threshold, these probabilities are turned into labels: for example, with a threshold of 50%, any sample whose confidence is greater than 50% is predicted as a positive example, and otherwise as a negative example.

To draw the curve, we take the predicted probability of each sample, sort these probabilities from largest to smallest, and then use each probability in turn as the threshold: samples with probability greater than or equal to the threshold are predicted positive, and all the rest negative. For each threshold we compute the corresponding recall and precision, which gives the sequence of points to plot. The following code does this with scikit-learn:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve, average_precision_score

# Assume there are probability values predicted by the model and actual labels
y_true = np.array([0, 1, 1, 0, 1])
y_scores = np.array([0.1, 0.3, 0.4, 0.6, 0.8])

# Calculate points on the PR curve
precision, recall, thresholds = precision_recall_curve(y_true, y_scores)

# Calculate average precision (Average Precision)
average_precision = average_precision_score(y_true, y_scores)

# Draw PR curve
plt.plot(recall, precision, label='PR curve (AP = %0.2f)' % average_precision)
plt.xlabel('Recall')
plt.ylabel('Precision')
plt.ylim([0.0, 1.05])
plt.xlim([0.0, 1.0])
plt.title('Precision-Recall Curve')
plt.legend(loc="lower left")
plt.show()

(1) Use the precision_recall_curve function to calculate points on the PR curve:

precision_recall_curve(y_true, y_scores): Pass in the actual labels and the predicted probability values; it computes the precision and recall under different classification thresholds and returns three arrays:

precision: the precision values, one per threshold, with a final element of 1.

recall: the corresponding recall values, with a final element of 0; recall decreases as the threshold increases.

thresholds: the classification thresholds for each point on the PR curve, in increasing order.

(2) Use the average_precision_score function to calculate the average precision (AP):

average_precision_score(y_true, y_scores): Pass in the actual labels and predicted probability values to calculate the average precision.
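To relate AP to the points returned by precision_recall_curve, the scikit-learn documentation defines it as the step-wise sum AP=\sum_n (R_n - R_{n-1}) P_n over the thresholds. A minimal sketch, reusing the same toy data as above, checks that this sum matches average_precision_score:

import numpy as np
from sklearn.metrics import precision_recall_curve, average_precision_score

y_true = np.array([0, 1, 1, 0, 1])
y_scores = np.array([0.1, 0.3, 0.4, 0.6, 0.8])

precision, recall, _ = precision_recall_curve(y_true, y_scores)

# recall is returned in decreasing order, so the differences are negated
ap_manual = -np.sum(np.diff(recall) * precision[:-1])
print(ap_manual, average_precision_score(y_true, y_scores))   # the two values should match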

(3) Draw the PR curve.

Running the code displays the PR curve (plot not reproduced here).

3. Advantages and Disadvantages of ROC Curve and PR Curve

1. ROC curve

(1) Advantages:

The ROC curve considers both positive and negative examples: TPR focuses on the positive examples and FPR on the negative examples, so it is a relatively balanced evaluation method, suitable for assessing the overall performance of a classifier.

For its two indicators, the denominator of TPR is the number of all positive examples and the denominator of FPR is the number of all negative examples, so neither depends on the specific class distribution.

(2) Disadvantages:

In the context of class imbalance, when the number of negative examples N far exceeds the number of positive examples P, a substantial increase in FP produces only an insignificant increase in FPR, so the ROC curve presents an overly optimistic picture of performance. This is less acceptable when the main concern is the prediction quality on positive examples.

2. PR curve

(1) Advantages:

Both indicators of the PR curve focus on the positive examples. In class-imbalance problems where the positive class is the main concern, the PR curve is therefore widely considered more informative than the ROC curve.

(2) Disadvantages:

The PR curve focuses only on positive examples and ignores negative examples.
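To make the class-imbalance point above concrete, the following sketch (with an assumed synthetic dataset from make_classification and a logistic regression model) compares ROC-AUC and AP on a heavily imbalanced problem; ROC-AUC typically looks much more optimistic than AP here:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, average_precision_score

# Assumed synthetic binary problem with roughly 2% positive examples
X, y = make_classification(n_samples=20000, n_features=20,
                           weights=[0.98, 0.02], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]

# Under strong imbalance, ROC-AUC is usually far more optimistic than AP
print("ROC-AUC:", roc_auc_score(y_test, scores))
print("AP     :", average_precision_score(y_test, scores))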

4. Experiment summary

  1. The ROC curve is suitable for evaluating a classification model's overall discrimination ability across different classification thresholds. Because its two axes are computed separately on the positive and the negative class, it still reflects the model's behaviour on both kinds of samples even when the class counts are uneven.

  2. The PR curve is suitable for evaluating the model's performance on the positive samples. For example, in fields such as information retrieval and recommendation systems, the focus is on how many of the samples predicted as positive are true positives.

  3. The shapes and performance evaluation metrics of ROC curves and PR curves are related, but not exactly the same. When comparing different models, you can choose to use the ROC curve or the PR curve according to actual needs.

  4. Average Precision (AP) summarizes the area under the PR curve as a single number and can be used as an indicator to comprehensively evaluate model performance.

  5. In the experiment, the steps for drawing ROC curves and PR curves are roughly the same. They use real labels and predicted probability values to calculate the corresponding points and draw the curves. In comparison, the PR curve pays more attention to the classification effect of positive samples.

In short, ROC curves and PR curves are commonly used tools to evaluate the performance of classification models. Select the appropriate curve for analysis and comparison based on actual needs.