Python: How do I get the SHAP value of each feature?

ahy6op9u, posted on 2022-12-28 in Python

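I am currently using the SHAP library. I have already generated a chart of each feature's average contribution, but I would like to know the exact values plotted on that chart.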

import numpy as np
import pandas as pd  
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston
import shap

# load_boston was removed in scikit-learn 1.2, so this needs an older release
boston = load_boston()
regr = pd.DataFrame(boston.data)
regr.columns = boston.feature_names
regr['MEDV'] = boston.target

X = regr.drop('MEDV', axis = 1)
Y = regr['MEDV']

fit = LinearRegression().fit(X, Y)

explainer = shap.LinearExplainer(fit, X, feature_dependence = 'independent')
# I used 'independent' because the result is consistent with the ordinary
# Shapley values, whereas 'correlated' is not.
# (Newer shap releases rename this option to feature_perturbation = 'interventional'.)

shap_values = explainer.shap_values(X)

shap.summary_plot(shap_values, X, plot_type = 'bar')

How can I get the exact values shown in the chart?

wpcxdonn #1

pd.DataFrame((zip(X.columns[np.argsort(np.abs(shap_values).mean(0))],
np.abs(shap_values).mean(0))), columns=["feature", "importance" ]).sort_values(by="importance", ascending=False)

Reference: GitHub

brvekthn #2

Try this:

features = X.columns
mask = np.abs(shap_values).mean(0).argsort()[::-1]
features[mask]
Index(['LSTAT', 'DIS', 'RAD', 'RM', 'TAX', 'PTRATIO', 'NOX', 'ZN', 'CRIM', 'B',
       'CHAS', 'INDUS', 'AGE'],
      dtype='object')
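
To see the values next to the names, the same ordering can be applied to the per-feature means as well. A minimal sketch, using a small made-up array (`toy_shap`) and made-up feature names (`toy_features`) in place of the real `shap_values` and `X.columns`:

```python
import numpy as np

# Made-up stand-ins for shap_values (rows = samples, columns = features)
# and X.columns; the numbers are illustrative only.
toy_features = np.array(["a", "b", "c"])
toy_shap = np.array([
    [0.5, -2.0, 0.1],
    [-1.5, 4.0, -0.3],
])

# Mean absolute SHAP value per feature: the quantity the bar plot draws.
mean_abs = np.abs(toy_shap).mean(axis=0)

# Indices that sort the means from largest to smallest.
order = mean_abs.argsort()[::-1]

# Apply the same order to names and values so they stay paired.
for name, value in zip(toy_features[order], mean_abs[order]):
    print(f"{name}: {value:.2f}")
```

On this toy data the loop prints `b: 3.00`, `a: 1.00`, `c: 0.20`. With the objects from the question, `toy_features` and `toy_shap` would presumably be replaced by `X.columns` and `shap_values`.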

avwztpqn #3

The accepted answer is wrong: the feature labels do not match the values.
Here is a corrected version:

pd.DataFrame((zip(X.columns[np.argsort(np.abs(shap_values).mean(0))][::-1],
-np.sort(-np.abs(shap_values).mean(0)))),
columns=["feature", "importance"])
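
One way to avoid pairing two separately sorted arrays is to attach the labels first and sort afterwards. The sketch below runs on a small made-up array and made-up feature names; `mean_abs_shap` is a hypothetical helper written for illustration, not part of the shap library.

```python
import numpy as np
import pandas as pd


def mean_abs_shap(shap_values, feature_names):
    """Return mean |SHAP| per feature as a Series, largest first.

    Hypothetical helper for illustration; not a shap API.
    """
    values = np.abs(np.asarray(shap_values)).mean(axis=0)
    # The Series index carries each label along with its value through the sort.
    series = pd.Series(values, index=list(feature_names), name="importance")
    return series.sort_values(ascending=False)


# Made-up data: 2 samples, 3 features (illustrative numbers only).
toy_shap = np.array([
    [0.5, -2.0, 0.1],
    [-1.5, 4.0, -0.3],
])
importance = mean_abs_shap(toy_shap, ["a", "b", "c"])
print(importance)
```

With the objects from the question, the call would presumably be `mean_abs_shap(shap_values, X.columns)`. Because each value is labeled before sorting, names and values cannot drift apart.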
