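I am trying to do multiclass classification here, and all of the cross-validation scores come out as nan.
The code below works fine for binary classification.
When I keep only accuracy and balanced_accuracy, it shows actual scores; as soon as I add f1, precision or recall, all of the scores become nan.
My code works well for binary classification, and I am using the same dataset; the only thing that changed for this problem is the target data.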
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

scoring = {
    "accuracy": "accuracy",
    "balanced_accuracy": "balanced_accuracy",
    "precision": "precision",
    "recall": "recall",
    "f1": "f1",
    "roc_auc": "roc_auc",
}

# load the dataset
def load_dataset(df):
    # retrieve the DataFrame contents as a numpy array
    data = df.values
    # split into input and output elements
    X, y = data[:, :-1], data[:, -1]
    # encode the class labels as integers
    y = LabelEncoder().fit_transform(y)
    return X, y

# evaluate a model
def evaluate_model(X, y, model):
    # define evaluation procedure
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # evaluate model
    scores = cross_validate(model, X, y, scoring=scoring, cv=cv, n_jobs=-1)
    return scores

model = DecisionTreeClassifier()
# load the dataset
X, y = load_dataset(df2)
# evaluate the model and store results
results_without_nlp = evaluate_model(X, y, model)
I also tried using these: from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
but that did not seem to help.
1 Answer
yzuktlbb #1
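For precision_score, recall_score and f1_score, I think you can try setting the average parameter for multiclass targets: average='micro' (or 'macro' or 'weighted'), because its default value is 'binary'. With the string scorer names passed to cross_validate, the equivalents are names such as "precision_macro", "recall_macro" and "f1_macro".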
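A minimal sketch of that suggestion applied to a cross_validate call like the one in the question, assuming scikit-learn 0.22 or later. The iris data, the names multiclass_scoring, X_demo, y_demo and demo_scores, and the smaller 5x2 fold setup are assumptions made for illustration and are not part of the original code. Passing error_score="raise" is also an addition: by default cross_validate records nan when a scorer raises, which is the likely reason the scores showed up as nan instead of an error message.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.tree import DecisionTreeClassifier

# Multiclass-capable scorers. The plain "precision", "recall" and "f1"
# strings use average="binary" and raise on targets with more than two classes.
multiclass_scoring = {
    "accuracy": "accuracy",
    "balanced_accuracy": "balanced_accuracy",
    "precision": "precision_macro",
    "recall": "recall_macro",
    "f1": "f1_macro",
    # Built by hand to show where the average parameter goes.
    "f1_weighted": make_scorer(f1_score, average="weighted"),
    # One-vs-rest AUC; relies on predict_proba, which the tree provides.
    "roc_auc": "roc_auc_ovr",
}

# Illustrative three-class dataset standing in for the asker's data.
X_demo, y_demo = load_iris(return_X_y=True)

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=1)
demo_scores = cross_validate(
    DecisionTreeClassifier(random_state=1),
    X_demo,
    y_demo,
    scoring=multiclass_scoring,
    cv=cv,
    error_score="raise",  # surface scorer errors instead of storing nan
)

# Mean of each metric over the 10 train/test splits.
for name in multiclass_scoring:
    print(name, demo_scores["test_" + name].mean())
```

Whether 'macro', 'micro' or 'weighted' is the right choice depends on how much the minority classes should count; 'macro' weights every class equally, while 'weighted' weights each class by its support.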