I am new to Python and got this error while running XGBoost: xgboost.core.XGBoostError: [15:49:05] C:/Users/Administrator/workspace/xgboost-win64_release_1.3.0/src/learner.cc:567: Check failed: mparam_.num_feature != 0 (0 vs. 0) : 0 feature is supplied. Are you using raw Booster interface?
I searched for this error but could not find many useful resources.
I suspect the error occurs in the prediction phase, but I am not sure. My dataset consists of two columns, ["Posts Frequency","Likes Count"], as shown below.
Here is my code:
# Load Dependencies
import pandas as pd
from numpy import where
import matplotlib.pyplot as plt
import numpy as np
from numpy import unique
from sklearn import metrics
import warnings
warnings.filterwarnings('ignore')
# Load the Data
# Define Columns
names = ["Posts Frequency","Likes Count"]
data = pd.read_csv("RANKING TEST (1).csv", encoding="utf-8", sep=";", delimiter=None,
                   names=names, delim_whitespace=False,
                   header=0, engine="python")
X = data.values[:,0:1]
y = data.values[:,1]
# Training
from sklearn.model_selection import GroupShuffleSplit
gss = GroupShuffleSplit(test_size=.20, n_splits=1, random_state = 7).split(data, groups=data['Posts Frequency'])
X_train_inds, X_test_inds = next(gss)
train_data= data.iloc[X_train_inds]
X_train = train_data.loc[:, ~train_data.columns.isin(['Posts Frequency','Likes Count'])]
y_train = train_data.loc[:, train_data.columns.isin(['Likes Count'])]
test_data= data.iloc[X_test_inds]
X_test = test_data.loc[:, ~test_data.columns.isin(['Likes Count'])]
y_test = test_data.loc[:, test_data.columns.isin(['Likes Count'])]
groups = train_data.groupby('Posts Frequency').size().to_frame('size')['size'].to_numpy()
import xgboost as xgb
model = xgb.XGBRanker(
    tree_method='gpu_hist',
    booster='gbtree',
    objective='rank:pairwise',
    random_state=42,
    learning_rate=0.1,
    colsample_bytree=0.9,
    eta=0.05,
    max_depth=6,
    n_estimators=110,
    subsample=0.75
)
model.fit(X_train, y_train, group=groups, verbose=True)
# make predictions
def predict(model, data):
    return model.predict(data.loc[:, ~data.columns.isin(['Posts Frequency'])])

predictions = (data.groupby('Posts Frequency')
               .apply(lambda x: predict(model, x)))
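A side note on the `groups` array built above, offered as a sketch rather than a confirmed diagnosis: `XGBRanker.fit` reads `group` as a list of consecutive block sizes, so the rows of each group need to be contiguous and in the same order as the sizes. `groupby(...).size()` returns sizes ordered by key, which lines up with the rows only if the training frame is sorted by that key. The small frame below is made up for illustration and is not the data from the CSV.

```python
import pandas as pd

# Made-up training rows, deliberately not sorted by the grouping key.
train_data = pd.DataFrame({
    "Posts Frequency": [2, 1, 2, 1, 3],
    "Likes Count": [7, 10, 9, 12, 4],
})

# Sort so that rows belonging to the same group sit next to each other.
train_sorted = train_data.sort_values("Posts Frequency", kind="stable")

# Group sizes in key order, which now matches the row order of train_sorted.
groups = train_sorted.groupby("Posts Frequency").size().to_numpy()

print(train_sorted["Posts Frequency"].tolist())  # [1, 1, 2, 2, 3]
print(groups.tolist())                           # [2, 2, 1]

# The sizes must add up to the number of training rows.
assert groups.sum() == len(train_sorted)
```

Sorting before the feature and label frames are sliced out keeps features, labels, and group sizes in the same row order.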
Can anyone help me?
Thanks in advance.
Sofia
1 Answer
You need to pass at least one feature to XGBoost. Maybe check what you are doing here:
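One candidate, judging only from the code in the question, is the line that builds `X_train`: it excludes both columns of a two-column frame, so nothing is left to train on. The sketch below uses a made-up frame in place of the CSV (the values are illustrative, not the asker's data) to show the resulting shapes, one possible alternative selection, and what the `predict` helper from the question selects.

```python
import pandas as pd

# Made-up stand-in for the CSV in the question; the values are illustrative only.
data = pd.DataFrame({
    "Posts Frequency": [1, 1, 2, 2, 3],
    "Likes Count": [10, 12, 7, 9, 4],
})

# Selection used in the question: both columns are excluded, so no features remain.
X_empty = data.loc[:, ~data.columns.isin(["Posts Frequency", "Likes Count"])]
print(X_empty.shape)  # (5, 0)

# One possible alternative (an assumption, not the asker's code): drop only the target.
X_features = data.loc[:, ~data.columns.isin(["Likes Count"])]
print(X_features.shape)  # (5, 1)

# The predict helper in the question drops "Posts Frequency" instead,
# which leaves the target column as the only input.
X_predict = data.loc[:, ~data.columns.isin(["Posts Frequency"])]
print(list(X_predict.columns))  # ['Likes Count']
```

Whatever columns are used for training, the same columns should be passed at prediction time. Also note that "Posts Frequency" is constant within each group when it doubles as the grouping key, so it may not give a ranker much to work with; a real fix probably needs additional feature columns in the data.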