内容简介:Softmax predicion scores are often used as a confidence score in a multi-class classification setting. In this post, we are going to show that softmax scores can be meaningless when doing regular empirical risk minimization by gradient descent. We are also
Softmax predicion scores are often used as a confidence score in a multi-class classification setting. In this post, we are going to show that softmax scores can be meaningless when doing regular empirical risk minimization by gradient descent. We are also going to apply the method presented in Deep Anomaly Detection with Outlier Exposure to mitigate this problem and add more meaning to the softmax score.
Discriminative classifiers (models that try to estimate P(y|x) from data) tend to be overconfident in their predictions, even if the input sample looks nothing like anything they have seen in the training phase. This makes it so that the output scores of such models cannot be reliably used as a confidence score since the model is often confident where it should not be.
Example :
In this synthetic example, we have one big cluster of class zero and another one for class one, plus two smaller groups of points of outliers that were not present in the training set.
If we apply a regular classifier to this we get something like this :
We see that the classifier is overly confident everywhere, even the outlier samples are classified with a very high score. The confidence score is displayed using the heat-map .
This is what makes it so it is not a good idea to directly use the softmax scores as confidence scores, if a classifier is confident everywhere without having seen any evidence to support it in the training then it probably means that the confidence scores are wrong.
However, if we use the approach presented in Deep Anomaly Detection with Outlier Exposure we can achieve much more reasonable Softmax scores :
This score map is much more reasonable and is useful to see where the model is rightly confident and where it is not. The outlier region has a very low confidence ~0.5 ( Equivalent to no confidence at all in a two-class setting).
Description of the Approach
The idea presented in Deep Anomaly Detection with Outlier Exposure is to use external data that is mostly different from your training/test data and force the model to predict the uniform distribution on this external data.
For example, if you are trying to build a classifier that predicts cat vs dog in images, you can get a bunch of bear and shark images and force the model to predict [0.5, 0.5] on those images.
Data And Model
We will use the 102 Flower as the in-distribution dataset and a subset of the OpenImage dataset as an out-of-distribution dataset. In the paper referenced in the introduction, they show that training on one set of out-of-distribution samples generalizes well to other sets that are out-of-distribution.
We use MobilenetV2 as our classification architecture and initialize the weights with Imagenet.
def get_model_classification(
input_shape=(None, None, 3),
weights="imagenet",
n_classes=102,
):
inputs = Input(input_shape)
base_model = MobileNetV2(
include_top=False, input_shape=input_shape, weights=weights
) x = base_model(inputs)
x = Dropout(0.5)(x)
out1 = GlobalMaxPooling2D()(x)
out2 = GlobalAveragePooling2D()(x)
out = Concatenate(axis=-1)([out1, out2])
out = Dropout(0.5)(out)
out = Dense(n_classes, activation="softmax")(out)
model = Model(inputs, out)
model.compile(
optimizer=Adam(0.0001), loss=categorical_crossentropy, metrics=["acc"]
) return model
We will use a generator to load the images from the hard drive batch by batch. In the baseline we only load the in-distribution images while in the Anomaly exposure model we load half the batch from in-distribution images with their correct label and the other half from out-of-distribution images with a uniform objective => :
target = [1 / n_label for _ in range(n_label)]
Results
Both training configurations get a little higher than 90% accuracy on in-distribution samples. We choose to predict “Don’t Know” if the softmax score is lower than 0.15 and thus abstain from making a class prediction.
Now let us see how each model behaves!
Regular training :
You can run the web application by doing :
streamlit run main.py
以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网
猜你喜欢:本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。
Perl最佳实践
康韦 / Taiwan公司 / 东南大学出版社 / 2008-3 / 78.00元
《Perl最佳实践》中所有的规则都是为了写出清晰、健壮、高效、可维护和简洁的程序而设计。Conway博士并不自诩这些规则是最广泛和最清晰的实践集,但实际上,《Perl最佳实践》确实提供了在实践中被广泛认可和应用的建议,而不是象牙塔似的编程理论。许多程序员凭直觉来编程,这些直觉来自于他们早期养成的习惯和风格。这样写出的程序似乎自然、直观,而且看起来也很不错。但是,如果你想严肃地对待程序员这份职业,那......一起来看看 《Perl最佳实践》 这本书的介绍吧!
在线进制转换器
各进制数互转换器
Markdown 在线编辑器
Markdown 在线编辑器