ROC Curve and AUC — Explained


What they mean and when they are useful


ROC (receiver operating characteristic) curve and AUC (area under the curve) are performance measures that provide a comprehensive evaluation of classification models.

The ROC curve summarizes a model’s performance by combining confusion matrices at all threshold values. AUC turns the ROC curve into a single numeric representation of performance for a binary classifier: it is the area under the ROC curve and takes a value between 0 and 1. AUC indicates how successful a model is at separating the positive and negative classes.

Before going into detail, let’s first explain the confusion matrix and how different threshold values change its outcome.

A confusion matrix is not a metric for evaluating a model, but it provides insight into the predictions. It goes deeper than classification accuracy by showing the correct and incorrect (i.e. true and false) predictions for each class. For a binary classification task, the confusion matrix is a 2×2 matrix; with three classes it is a 3×3 matrix, and so on.

Confusion matrix of a binary classification (Image by author)

Let’s assume class A is the positive class and class B is the negative class. The key terms of the confusion matrix are as follows (a short code sketch for extracting these counts follows the list):

  • True positive (TP): Predicting the positive class as positive (ok)
  • False positive (FP): Predicting the negative class as positive (not ok)
  • False negative (FN): Predicting the positive class as negative (not ok)
  • True negative (TN): Predicting the negative class as negative (ok)
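
In code, these four counts can be read off a computed confusion matrix. A minimal sketch, assuming scikit-learn is available (the toy labels are made up for illustration):

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual classes (1 = positive)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # predicted classes

# For binary labels {0, 1}, scikit-learn lays the matrix out as
# [[TN, FP], [FN, TP]], so ravel() unpacks the counts in that order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, FP={fp}, FN={fn}, TN={tn}")  # TP=3, FP=1, FN=1, TN=3
```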

Algorithms like logistic regression return probabilities rather than discrete outputs. We set a threshold on these probabilities to distinguish the positive class from the negative class. Depending on the threshold value, the predicted class of some observations may change, as the sketch below illustrates.
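
A minimal sketch of this thresholding step, assuming NumPy (the probability values are made up for illustration):

```python
import numpy as np

# Made-up predicted probabilities of the positive class for four samples.
probs = np.array([0.15, 0.45, 0.55, 0.85])

# The same probabilities yield different class predictions at different thresholds.
for threshold in (0.4, 0.6):
    preds = (probs >= threshold).astype(int)
    print(f"threshold={threshold}: {preds}")

# threshold=0.4: [0 1 1 1]
# threshold=0.6: [0 0 0 1]  <- the two middle samples flipped class
```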

How threshold value can change the predicted class (Image by author)

As we can see from the image above, adjusting the threshold value changes the prediction and thus results in a different confusion matrix. When the elements in a confusion matrix change, precision and recall also change.

Precision and recall metrics take the classification accuracy one step further and allow us to get a more specific understanding of model evaluation.

The focus of precision is positive predictions. It indicates how many of the positive predictions are actually correct.

The focus of recall is the actual positive class. It indicates how many of the positive samples the model is able to predict correctly.
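
In terms of the confusion matrix elements, the two metrics are:

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}$$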

Note: We cannot maximize both precision and recall at the same time because there is a trade-off between them: increasing precision decreases recall and vice versa. We can aim to maximize precision or recall depending on the task. For an email spam detection model, we try to maximize precision because we want to be correct when an email is flagged as spam; we do not want to label a normal email as spam (i.e. a false positive). On the other hand, for a tumor detection task, we need to maximize recall because we want to detect as many of the actual positive cases as possible.

What the ROC curve does is provide a summary of a model’s performance by combining confusion matrices at all threshold values.

ROC Curve (image source)

The ROC curve has two axes, both of which take values between 0 and 1. The y-axis is the true positive rate (TPR), also known as sensitivity. It is the same as recall: it measures the proportion of the positive class that is correctly predicted as positive. The x-axis is the false positive rate (FPR), which is equal to 1 − specificity. Specificity is the counterpart of sensitivity for the negative class: it measures the proportion of the negative class that is correctly predicted as negative.

Sensitivity vs Specificity (Image by author)
TPR and FPR in terms of confusion matrix elements (Image by author)
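
In terms of the confusion matrix elements shown in the figures above:

$$\text{TPR} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}, \qquad \text{FPR} = \frac{FP}{FP + TN} = 1 - \text{Specificity}$$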

If the threshold is set to 0, the model predicts every sample as positive, so there are no false negatives or true negatives, and both TPR and FPR are 1. If the threshold is set to 1, there are no positive predictions at all; TP and FP are both 0, so TPR and FPR become 0. Hence, setting the threshold to exactly 0 or 1 is not a good choice.

We aim to increase the true positive rate (TPR) while keeping the false positive rate (FPR) low. As the ROC curve shows, increasing TPR also increases FPR, so it comes down to deciding how many false positives we can tolerate.

The ROC curve gives us an overview of model performance at different threshold values. AUC is the area under the ROC curve between (0, 0) and (1, 1), which can be calculated using integral calculus. It aggregates the model’s performance across all threshold values into a single number. The best possible AUC is 1, indicating a perfect classifier; an AUC of 0 means all predictions are wrong, and an AUC of 0.5 corresponds to random guessing.
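
In practice there is no need to integrate by hand. A minimal sketch, assuming scikit-learn (the labels and scores below are made-up illustration data):

```python
from sklearn.metrics import roc_curve, roc_auc_score

y_true = [0, 0, 1, 1, 0, 1, 0, 1]                     # actual classes
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]  # predicted P(positive)

# roc_curve sweeps the threshold and returns the (FPR, TPR) points of the curve;
# roc_auc_score computes the area under that curve.
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
auc = roc_auc_score(y_true, y_scores)
print(f"AUC = {auc:.3f}")  # AUC = 0.875
```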

Note: AUC does not depend on the classification threshold. Changing the threshold does not change AUC, because AUC aggregates over the entire ROC curve.

AUC for two different classifiers (image source)

The figure above shows the ROC curves for classifiers A and B. A is clearly the better classifier: its AUC is higher, and for the same FPR values A achieves a higher TPR. Similarly, for the same TPR values, A has a smaller FPR.

AUC is classification-threshold invariant, and for this very reason it is not the optimal evaluation metric for every task. For instance, when working on email spam detection, we do not want any false positives; for tumor detection, we cannot afford a false negative. In such cases we can tune the model to our needs by adjusting the classification threshold. Since AUC is not affected by the threshold value, it is not a good metric choice here; precision or recall should be used as the evaluation metric instead.

