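Question
I am developing a LightGBM machine learning model. I fixed the data so that the shapes match, but LightGBM still fails with the fatal error "Length of label is not same with #data". I am looking for suggestions on how to resolve this error.
Data shapes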
print(y_train.shape)
print(X_train.shape)
print(X_test.shape)
y_train: (1019169,)
X_train: (1019169, 12)
X_test: (1019169, 12)
However, when I use shape[-1], I get the following:
print(y_train.shape[-1])
print(X_train.shape[-1])
print(X_test.shape[-1])
y_train: 1019169
X_train: 12
X_test: 12
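The two sets of numbers above are not directly comparable: shape[-1] is the size of the last axis, which is the row count for the 1-D y_train but the column count for the 2-D X_train. A minimal sketch with synthetic arrays (the 1000-row size is an assumption chosen for illustration, not the real data) shows the difference:

```python
import numpy as np

# Synthetic stand-ins with far fewer rows than the real data (assumed sizes)
y_train = np.zeros(1000)
X_train = np.zeros((1000, 12))

# shape[-1] is the last axis: rows for 1-D y_train, columns for 2-D X_train
last_axis = (y_train.shape[-1], X_train.shape[-1])

# shape[0] is the row count, which is what has to agree with the label length
row_counts = (y_train.shape[0], X_train.shape[0])

print(last_axis)   # (1000, 12)
print(row_counts)  # (1000, 1000)
```

When checking whether labels and features line up, shape[0] or len() is the value to compare.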
The data comes from the Kaggle New York City taxi dataset.
Code:
lgb_params = {
    'metric': 'rmse',
    'num_leaves': 31,
    'objective': 'binary',
    'is_training_metric': True}
X_train = X_train[['passenger_count', 'pickup_longitude', 'pickup_latitude', 'dropoff_latitude', 'trip_duration', 'store_fwd_flag', 'direction', 'month', 'week', 'weekday', 'hour', 'minute_oftheday']]
X_test = X_train[['passenger_count', 'pickup_longitude', 'pickup_latitude', 'dropoff_latitude', 'trip_duration', 'store_fwd_flag', 'direction', 'month', 'week', 'weekday', 'hour', 'minute_oftheday']]
#lgb.Dataset.subset(used_columns: List[str]) -> lgb.Dataset
lgb_train = lgb.Dataset(X_train, y_train)
lgb_test = lgb.Dataset(X_test, y_test)
lgb_model = lgb.train(lgb_params, lgb_train, num_boost_round=10,
                      valid_sets=[lgb_train, lgb_test],
                      early_stopping_rounds=6)
The error trace is shown below:
[LightGBM] [Warning] Contains only one class
[LightGBM] [Info] Number of positive: 1019169, number of negative: 0
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.060011 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1606
[LightGBM] [Info] Number of data points in the train set: 1019169, number of used features: 12
[LightGBM] [Fatal] Length of label is not same with #data
Expected result: a trained LightGBM model and an MSE performance evaluation.
1 Answer
brccelvz #1
I solved the problem: y_train was incorrect. I updated y_train so that it has the correct shape after the train/test split.
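One way to keep labels and features aligned is to split both with the same row indices and check the row counts before building any lgb.Dataset. The sketch below is only an illustration under stated assumptions: split_together is a hypothetical helper (not part of LightGBM or the original code), the data is synthetic, and a NumPy permutation stands in for whatever train/test split was actually used.

```python
import numpy as np

def split_together(X, y, test_fraction=0.2, seed=0):
    """Split features and labels with the same row indices (hypothetical helper)."""
    X = np.asarray(X)
    y = np.asarray(y)
    # Fail early with a readable message instead of a LightGBM fatal error
    if X.shape[0] != y.shape[0]:
        raise ValueError(f"X has {X.shape[0]} rows but y has {y.shape[0]} labels")
    rng = np.random.default_rng(seed)
    order = rng.permutation(X.shape[0])
    n_test = int(round(X.shape[0] * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Synthetic stand-in data (assumed): 100 rows, 12 feature columns
X = np.random.default_rng(1).normal(size=(100, 12))
y = np.arange(100, dtype=float)

X_train, X_test, y_train, y_test = split_together(X, y)

# Each feature matrix must have as many rows as its label vector
assert X_train.shape[0] == y_train.shape[0]
assert X_test.shape[0] == y_test.shape[0]
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
```

With both pairs verified this way, each lgb.Dataset(X, y) call receives a label vector whose length equals the number of data rows, for the validation set as well as the training set.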