How to find precision and recall values from a confusion matrix using NumPy

y0u0uwnf · posted 12 months ago in Other
Follow (0) | Answers (2) | Views (122)

I am trying to find the precision and recall for the confusion matrix given below, but an error occurs. How can I do it using NumPy and scikit-learn?

array([[748,   0,   4,   5,   1,  16,   9,   4,   8,   0],
       [  0, 869,   6,   5,   2,   2,   2,   5,  12,   3],
       [  6,  19, 642,  33,  13,   7,  16,  15,  31,   6],
       [  5,   3,  30, 679,   2,  44,   1,  12,  23,  12],
       [  4,   7,   9,   2, 704,   5,  10,   8,   7,  43],
       [  5,   6,  10,  39,  11, 566,  18,   4,  33,  10],
       [  6,   5,  17,   2,   5,  12, 737,   2,   9,   3],
       [  5,   7,   8,  18,  14,   2,   0, 752,   5,  42],
       [  7,  15,  34,  28,  12,  29,   6,   4, 600,  18],
       [  4,   6,   6,  16,  21,   4,   0,  50,   8, 680]], dtype=int64)



hfwmuf9z1#

As others have already recommended, you can compute what you need directly from `y_true` and `y_pred` using `sklearn.metrics` with `precision_score` and `recall_score`. Read more about the `precision` and `recall` scores.
However, IIUC, you want to do the same thing directly from the confusion matrix. Below is how you can compute precision and recall straight from it.
1. First, I will demonstrate with a dummy example, showing the results from the sklearn API and then computing them directly.
Note: there are two types of precision and recall that are commonly computed:

  • Micro precision: sum the TP over all classes and divide by the total (TP + FP)
  • Macro precision: compute TP / (TP + FP) for each class separately, then take the mean (average)
  • You can find more details on the types of precision (and recall) here.

I show both approaches below for your understanding:

import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

####################################################
#####Using SKLEARN API on TRUE & PRED Labels########
####################################################

y_true = [0, 1, 2, 2, 1, 1]
y_pred = [0, 2, 2, 2, 1, 2]
confusion_matrix(y_true, y_pred)

precision_micro = precision_score(y_true, y_pred, average="micro")
precision_macro = precision_score(y_true, y_pred, average="macro")
recall_micro = recall_score(y_true, y_pred, average='micro')
recall_macro = recall_score(y_true, y_pred, average="macro")

print("Sklearn API")
print("precision_micro:", precision_micro)
print("precision_macro:", precision_macro)
print("recall_micro:", recall_micro)
print("recall_macro:", recall_macro)

####################################################
####Calculating directly from confusion matrix######
####################################################

cf = confusion_matrix(y_true, y_pred)
TP = cf.diagonal()

precision_micro = TP.sum()/cf.sum()
recall_micro = TP.sum()/cf.sum()

#NOTE: the sum of row-wise sums of a matrix = the sum of column-wise sums = the sum of all its elements.
#Therefore, micro-precision and micro-recall are mathematically identical for a multi-class problem.

precision_macro = np.nanmean(TP/cf.sum(0))
recall_macro = np.nanmean(TP/cf.sum(1))

print("")
print("Calculated:")
print("precision_micro:", precision_micro)
print("precision_macro:", precision_macro)
print("recall_micro:", recall_micro)
print("recall_macro:", recall_macro)

2. Now that I have shown that the definitions behind the API work as described, let's compute precision and recall for your case:

cf = [[748,   0,   4,   5,   1,  16,   9,   4,   8,   0],
      [  0, 869,   6,   5,   2,   2,   2,   5,  12,   3],
      [  6,  19, 642,  33,  13,   7,  16,  15,  31,   6],
      [  5,   3,  30, 679,   2,  44,   1,  12,  23,  12],
      [  4,   7,   9,   2, 704,   5,  10,   8,   7,  43],
      [  5,   6,  10,  39,  11, 566,  18,   4,  33,  10],
      [  6,   5,  17,   2,   5,  12, 737,   2,   9,   3],
      [  5,   7,   8,  18,  14,   2,   0, 752,   5,  42],
      [  7,  15,  34,  28,  12,  29,   6,   4, 600,  18],
      [  4,   6,   6,  16,  21,   4,   0,  50,   8, 680]]

cf = np.array(cf)
TP = cf.diagonal()

precision_micro = TP.sum()/cf.sum()
recall_micro = TP.sum()/cf.sum()

precision_macro = np.nanmean(TP/cf.sum(0))
recall_macro = np.nanmean(TP/cf.sum(1))

print("Calculated:")
print("precision_micro:", precision_micro)
print("precision_macro:", precision_macro)
print("recall_micro:", recall_micro)
print("recall_macro:", recall_macro)
Calculated:
precision_micro: 0.872125
precision_macro: 0.8702549015235986
recall_micro: 0.872125
recall_macro: 0.8696681555022805
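
As a supplement to the approach above (this cross-check is not in the original answer), the same diagonal/sum arithmetic also yields the per-class precision and recall vectors, and `precision_score`/`recall_score` with `average=None` return the matching values, which is a handy way to verify the hand calculation:

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true = [0, 1, 2, 2, 1, 1]
y_pred = [0, 2, 2, 2, 1, 2]

# Build the confusion matrix by hand: rows = true class, columns = predicted class
cf = np.zeros((3, 3), dtype=int)
for t, p in zip(y_true, y_pred):
    cf[t, p] += 1

TP = cf.diagonal()
per_class_precision = TP / cf.sum(axis=0)  # TP / (TP + FP), column-wise sums
per_class_recall = TP / cf.sum(axis=1)     # TP / (TP + FN), row-wise sums

# sklearn returns the same per-class vectors when average=None
assert np.allclose(per_class_precision, precision_score(y_true, y_pred, average=None))
assert np.allclose(per_class_recall, recall_score(y_true, y_pred, average=None))
```

Beware of empty columns or rows: a class that is never predicted (or never occurs) produces a division by zero here, which is exactly why the answer above averages with `np.nanmean`.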



vlf7wbxs2#

You can use scikit-learn to compute the `recall` and `precision` for each class.
Example:

from sklearn.metrics import classification_report

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']
print(classification_report(y_true, y_pred, target_names=target_names))

              precision    recall  f1-score   support

     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3

    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5
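
If you need these numbers programmatically rather than as a printed table, `classification_report` also accepts `output_dict=True` and returns the same values as a nested dict; a minimal sketch using the example above:

```python
from sklearn.metrics import classification_report

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']

# output_dict=True returns nested dicts instead of a formatted string,
# keyed by the target names plus 'macro avg' / 'weighted avg'
report = classification_report(y_true, y_pred, target_names=target_names,
                               output_dict=True)

print(report['class 0']['precision'])    # 0.5
print(report['class 2']['recall'])       # 0.666...
print(report['macro avg']['precision'])  # 0.5
```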

Reference: here
