ROC curve and PR curve

Table of Contents

1. PR and ROC curve concepts

2. Confusion matrix

3. The difference between ROC curve and PR curve

4. Drawing the PR curve and ROC curve

5. Summary


1. PR and ROC curve concepts

PR curves and ROC curves are commonly used tools for evaluating binary classification models (classifiers) and for showing how well a model performs. Here is a brief introduction to each:

PR Curve (Precision-Recall Curve): a curve used to evaluate the performance of a binary classification model. The horizontal axis is recall and the vertical axis is precision. Precision is the proportion of the samples predicted as positive by the model that are actually positive, and recall is the proportion of the actual positive samples that the model correctly predicts as positive. Plotting precision against recall at different thresholds yields the PR curve.

ROC Curve (Receiver Operating Characteristic Curve): a curve used to evaluate the performance of a binary classification model. The horizontal axis is the false positive rate (FPR) and the vertical axis is the true positive rate (TPR). TPR is the proportion of the actual positive samples that the model correctly predicts as positive, and FPR is the proportion of the actual negative samples that the model incorrectly predicts as positive. Plotting TPR against FPR at different thresholds yields the ROC curve.

Compared with the ROC curve, the PR curve pays more attention to how well the model recognizes the minority class (i.e., the positive examples), so it is better suited to imbalanced data; when the minority class matters most, we can turn to the PR curve. The ROC curve pays more attention to the model's classification performance over the whole sample, and it can help us select models and adjust the classifier's threshold.

In short, the PR curve and the ROC curve are commonly used tools for evaluating binary classification models. By plotting indicators such as recall, precision, TPR, and FPR, the performance of a model can be assessed from different angles.
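
To make the two indicators concrete, here is a small worked example with invented numbers: suppose a model predicts 8 samples as positive, 6 of which are actually positive, and the data set contains 10 actual positive samples. Then

Precision=\frac{6}{8}=0.75

Recall=\frac{6}{10}=0.6

Sweeping the decision threshold changes both numbers at once, and plotting the resulting pairs traces out the PR curve.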

2. Confusion matrix

In the field of machine learning, the confusion matrix is also called the possibility matrix or error matrix. It is a visualization tool used especially in supervised learning; in unsupervised learning the analogous table is usually called a matching matrix. In image accuracy evaluation it is mainly used to compare classification results with the actual measured values, and the accuracy of the classification results can be read directly from the confusion matrix.

The structure of the confusion matrix is generally as follows.
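
A plain-text sketch of the usual layout (rows are true classes, columns are predicted classes):

                      Predicted positive    Predicted negative
Actually positive             TP                    FN
Actually negative             FP                    TN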

The meaning of the confusion matrix is:

1. Each column of the confusion matrix represents a predicted category, and the total of each column is the number of samples predicted to belong to that category;
2. Each row represents the true category of the data, and the total of each row is the number of samples that actually belong to that category; the value in each cell is the number of samples of that true category that were predicted as the column's category.

True Positive (TP): True positive class. The true class of the sample is the positive class, and the model also identifies it as the positive class.

False Negative (FN): False negative class. The true class of the sample is positive, but the model identifies it as negative.

False Positive (FP): False positive class. The true class of the sample is negative, but the model identifies it as positive.

True Negative (TN): True negative class. The true class of the sample is the negative class, and the model identifies it as the negative class.
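
As a minimal sketch (not from the original text), the four quantities and the indicators built from them can be computed with sklearn.metrics.confusion_matrix; the labels and predictions below are purely hypothetical:

import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])  # hypothetical true labels
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])  # hypothetical model predictions

# With labels=[0, 1], sklearn returns rows = true class, columns = predicted class:
# [[TN, FP],
#  [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

precision = tp / (tp + fp)  # proportion of predicted positives that are truly positive
recall = tp / (tp + fn)     # also the true positive rate (TPR)
fpr = fp / (fp + tn)        # proportion of actual negatives wrongly predicted positive
print(tn, fp, fn, tp, precision, recall, fpr)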

3. The difference between ROC curve and PR curve

The ROC curve and the PR curve are both used to evaluate the performance of binary classification models. Their main differences lie in the evaluation indicators and in the situations they suit.

Evaluation indicators: The horizontal axis of the ROC curve is FPR and the vertical axis is TPR. FPR is the proportion of negative samples that the model incorrectly classifies as positive, and TPR is the proportion of positive samples that the model correctly classifies as positive. The horizontal axis of the PR curve is recall and the vertical axis is precision. Recall is the proportion of positive samples that the model classifies correctly, and precision is the proportion of true positives among the samples the model predicts as positive.

Applicable situations: The ROC curve is suitable when the class distribution in the data set is relatively balanced, or when the focus is on keeping the false positive rate (FPR) low; that is, when we care more about the model's false alarms, the ROC curve can be used to evaluate its performance. The PR curve is suitable when the class distribution is imbalanced, or when the focus is on the prediction precision for minority-class samples; that is, when we care more about how well the model recognizes the minority class, the PR curve is the better evaluation tool.

In summary, the ROC curve and the PR curve are both commonly used tools for evaluating binary classification models, but the appropriate curve should be chosen for the situation at hand.
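
To make the difference tangible, here is a hedged sketch (the data are purely synthetic, with about 5% positives and only weakly informative scores) comparing the two usual summary numbers: the area under the ROC curve and the average precision, which summarizes the PR curve:

import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
n = 2000
y_true = (rng.random(n) < 0.05).astype(int)  # heavily imbalanced: roughly 5% positives
scores = 0.3 * y_true + rng.random(n)        # positives get only slightly higher scores

print('ROC AUC:          ', roc_auc_score(y_true, scores))
print('Average precision:', average_precision_score(y_true, scores))
# On imbalanced data like this the ROC AUC usually looks respectable while the
# average precision stays much lower, which is why the PR curve is preferred here.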

4. Drawing the PR curve and ROC curve

Having covered recall and precision, we can now draw the P-R curve. When an algorithm classifies a sample, it outputs a confidence, i.e., the probability that the sample is a positive example. Choosing a suitable threshold turns these probabilities into decisions: for example, with a threshold of 50%, a sample whose confidence exceeds 50% is predicted positive, otherwise it is predicted negative.
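
As a tiny illustration with made-up confidence values, turning confidences into decisions at a 50% threshold looks like this:

import numpy as np

scores = np.array([0.9, 0.6, 0.4, 0.2])         # hypothetical confidences for four samples
threshold = 0.5
predictions = (scores > threshold).astype(int)  # -> array([1, 1, 0, 0])
print(predictions)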

Drawing the curve follows the same idea. We generate a random probability for each sample to represent how likely it is to be a positive example, sort these probabilities from largest to smallest, and then take each sample's probability in turn as the threshold: samples with probability above the threshold are predicted positive and all the remaining ones negative. Using every sample's probability as a threshold gives a pair of recall and precision values for each threshold, and from these points we can draw the curve. The code is given below:

import matplotlib.pyplot as plt
import numpy as np


def plot(recalls, precisions):  # Draw the P-R curve
    plt.figure()
    plt.xlabel('Recall (R)')
    plt.ylabel('Precision (P)')
    ticks = np.arange(0, 1.1, 0.2)
    plt.xticks(ticks)
    plt.yticks(ticks)
    plt.plot(recalls, precisions)
    plt.show()


def calculate():
    # Initialize the sample labels: 1 is a positive example, 0 is a negative example
    trainlabel = np.random.randint(0, 2, size=100)

    # Generate 100 probability values (confidences), i.e. the probability that each sample is positive
    traindata = np.random.rand(100)

    # Indices of the samples sorted by confidence from largest to smallest
    sortedTraindata = traindata.argsort()[::-1]

    k = []  # recall values
    v = []  # precision values
    # Count the number of actual positive examples in the sample
    num = np.sum(trainlabel == 1)
    for i in range(100):
        num_guess = i + 1  # number of samples predicted positive at this threshold
        num_real = 0       # of those, how many are actually positive (true positives)
        for j in range(i + 1):
            a = sortedTraindata[j]
            if trainlabel[a] == 1:
                num_real += 1
        p = num_real / num_guess
        r = num_real / num
        v.append(p)
        k.append(r)
    plot(k, v)


if __name__ == '__main__':
    calculate()

The curve after running is as shown in the figure:

Of course, the result differs from run to run because the data are randomly generated, and these plots still differ somewhat from the ones in the book. To double-check, I looked at other people's examples; most of them use the sklearn library for the plotting, which offers a more convenient way to draw the curve:

import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import precision_recall_curve


def plot(precision, recall):  # Draw the P-R curve
    plt.figure()
    plt.xlabel('Recall (R)')
    plt.ylabel('Precision (P)')
    ticks = np.arange(0, 1.1, 0.2)
    plt.xticks(ticks)
    plt.yticks(ticks)
    plt.plot(recall, precision)
    plt.show()


def calculate():
    # Initialize the sample labels: 1 is a positive example, 0 is a negative example
    trainlabel = np.random.randint(0, 2, size=100)

    # Generate 100 probability values (confidences), i.e. the probability that each sample is positive
    traindata = np.random.rand(100)

    # Compute precision and recall at every threshold. precision_recall_curve is limited to
    # binary classification: the first argument is the binary labels, the second is the
    # estimated probabilities, and pos_label (default 1) selects the positive class.
    precision, recall, thresholds = precision_recall_curve(trainlabel, traindata)
    plot(precision, recall)


if __name__ == '__main__':
    calculate()

Here the functions in the sklearn library are used to compute recall and precision; the corresponding introduction and parameter usage can be found in the official documentation. The result after running:
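
As a side note (not part of the code above), newer versions of scikit-learn (assumed here to be 1.0 or later) also ship a plotting helper, PrecisionRecallDisplay, that wraps the computation and the plotting in one call; a minimal sketch with the same random data:

import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import PrecisionRecallDisplay

trainlabel = np.random.randint(0, 2, size=100)  # same random setup as above
traindata = np.random.rand(100)

PrecisionRecallDisplay.from_predictions(trainlabel, traindata)
plt.show()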

Drawing the ROC curve

The ROC curve is very similar to the P-R curve. We sort the samples according to the learner's prediction scores and, following that order, predict the samples as positive one by one; computing the horizontal and vertical coordinates at each step yields the ROC curve. The difference from the P-R curve is that the horizontal axis of the ROC curve is the false positive rate and the vertical axis is the true positive rate. Their calculation expressions can also be written down.

The true positive rate is actually the same as the recall rate R, that is
TPR=\frac{TP}{TP + FN}

The false positive rate is
FPR=\frac{FP}{TN + FP}

As a concrete illustration, TPR is just R; and if, say, a classifier produces FP = 10 false positives and TN = 20 true negatives, then

FPR=\frac{10}{10 + 20}\approx 0.33

The denominator is simply the actual number of negative examples. With this in place, we can draw the ROC curve:

import matplotlib.pyplot as plt
import numpy as np


def plot(tpr, fpr):  # Draw the ROC curve
    plt.figure()
    plt.xlabel('False Positive Rate (FPR)')
    plt.ylabel('True Positive Rate (TPR)')
    ticks = np.arange(0, 1.1, 0.2)
    plt.xticks(ticks)
    plt.yticks(ticks)
    plt.plot(fpr, tpr)
    # Diagonal reference line corresponding to a random classifier
    x1 = np.arange(0, 1.0, 0.1)
    plt.plot(x1, x1, color='blue', linewidth=2, linestyle='--')
    plt.show()


def calculate():
    # Initialize the sample labels: 1 is a positive example, 0 is a negative example
    trainlabel = np.random.randint(0, 2, size=100)

    # Generate 100 probability values (confidences), i.e. the probability that each sample is positive
    traindata = np.random.rand(100)

    # Indices of the samples sorted by confidence from largest to smallest
    sortedTraindata = traindata.argsort()[::-1]

    k = []  # TPR values
    v = []  # FPR values
    # Count the number of actual positive and negative examples in the sample
    num = np.sum(trainlabel == 1)
    num1 = 100 - num
    for i in range(100):
        num_guess = i + 1  # number of samples predicted positive at this threshold
        tp = 0             # of those, how many are actually positive (true positives)
        for j in range(i + 1):
            a = sortedTraindata[j]
            if trainlabel[a] == 1:
                tp += 1
        fp = num_guess - tp  # the remaining predicted positives are false positives
        fpr = fp / num1
        tpr = tp / num
        v.append(fpr)
        k.append(tpr)
    plot(k, v)


if __name__ == '__main__':
    calculate()

Similarly, the sklearn library provides a function that computes the true positive rate and false positive rate needed for the ROC curve; the implementation code is given below:

import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import roc_curve


def plot(fpr, tpr):  # Draw the ROC curve
    plt.figure()
    plt.xlabel('False Positive Rate (FPR)')
    plt.ylabel('True Positive Rate (TPR)')
    ticks = np.arange(0, 1.1, 0.2)
    plt.xticks(ticks)
    plt.yticks(ticks)
    plt.plot(fpr, tpr)
    # Diagonal reference line corresponding to a random classifier
    x1 = np.arange(0, 1.0, 0.1)
    plt.plot(x1, x1, color='blue', linewidth=2, linestyle='--')
    plt.show()


def calculate():
    # Initialize the sample labels: 1 is a positive example, 0 is a negative example
    trainlabel = np.random.randint(0, 2, size=100)

    # Generate 100 probability values (confidences), i.e. the probability that each sample is positive
    traindata = np.random.rand(100)

    # Compute FPR and TPR at every threshold
    fpr, tpr, thresholds = roc_curve(trainlabel, traindata)
    plot(fpr, tpr)


if __name__ == '__main__':
    calculate()

The result after running:
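
As a follow-up (not in the original text), the plotted ROC curve is usually summarized by the area under it; below is a minimal sketch using the same random setup, where purely random scores should give an area close to 0.5:

import numpy as np
from sklearn.metrics import roc_curve, auc, roc_auc_score

trainlabel = np.random.randint(0, 2, size=100)  # same random setup as above
traindata = np.random.rand(100)

fpr, tpr, thresholds = roc_curve(trainlabel, traindata)
print(auc(fpr, tpr))                         # trapezoidal area under the plotted curve
print(roc_auc_score(trainlabel, traindata))  # the same value computed directly from labels and scores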

5. Summary

In the evaluation of binary classification models, PR curves and ROC curves are commonly used tools. The purpose of this experimental report is to compare the performance evaluation capabilities of PR curves and ROC curves. We analyze the performance of the model by plotting PR curves and ROC curves. The PR curve shows the relationship between recall and precision, while the ROC curve shows the relationship between true positive rate and false positive rate.

From the analysis of experimental results, we can draw the following conclusions:

  • The ROC curve is suitable for situations where the class distribution is balanced or where the focus is on reducing the false alarm rate. When we focus on the overall classification effect of the model, the ROC curve can provide a comprehensive performance evaluation.
  • The PR curve is suitable for situations where the class distribution is unbalanced or the recognition rate of minority class samples is focused. When we focus on the model’s prediction accuracy for minority classes, the PR curve can better evaluate the performance of the model.

In addition, we also need to consider the actual needs and model application scenarios to select appropriate evaluation indicators. If we pay more attention to the classification accuracy of the minority class, then the PR curve can help us find a more suitable threshold. And if we are more concerned about the overall classification performance or the false positive rate of the model, then the ROC curve can provide more comprehensive information.
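
For the threshold-selection point, one common recipe (sketched here on the same kind of random data as in section 4; it is not part of the original text) is to pick the threshold that maximizes the F1 score computed from the precision_recall_curve output:

import numpy as np
from sklearn.metrics import precision_recall_curve

trainlabel = np.random.randint(0, 2, size=100)  # hypothetical labels, as in section 4
traindata = np.random.rand(100)                 # hypothetical scores

precision, recall, thresholds = precision_recall_curve(trainlabel, traindata)
# precision and recall have one more entry than thresholds (the final (R=0, P=1) point),
# so drop that last point before combining them into F1.
f1 = 2 * precision[:-1] * recall[:-1] / (precision[:-1] + recall[:-1] + 1e-12)
print('Best threshold:', thresholds[np.argmax(f1)])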

In summary, PR curves and ROC curves are important performance evaluation tools that can help us fully understand the classification ability of the model. Choosing an appropriate curve depends on the class distribution, focus, and the specific needs of the model application.