python-3.x LightGBM fatal error: length of label is not the same as #data, even though the shapes appear to match

ltskdhd1  posted on 2023-03-31  in  Python

Question

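I am developing a LightGBM machine learning model. As far as I can tell the shapes match, and I have already fixed the data so that they agree. Even so, training stops with a fatal error saying the length of the label is not the same as the data. I am looking for advice on how to resolve this LightGBM label-length error.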

Data shapes

print(y_train.shape)
print(X_train.shape)
print(X_test.shape)

y_train: (1019169,)
X_train: (1019169, 12)
X_test: (1019169, 12)

However, using shape[-1] gives the following:

print(y_train.shape[-1])
print(X_train.shape[-1])
print(X_test.shape[-1])

y_train: 1019169
X_train: 12
X_test: 12
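The two printouts above differ because shape[-1] is the size of the last axis, not the number of rows. For a 1-D label array the last axis is the only axis, so it reports the row count, while for a 2-D feature matrix it reports the column count. The value LightGBM compares against the label length is the row count, shape[0]. A minimal sketch with small stand-in arrays (the sizes are made up for illustration):

```python
import numpy as np

# Small stand-ins for the arrays in the question (the real ones have 1019169 rows).
X = np.zeros((5, 12))
y = np.zeros(5)

# shape[0] is the number of rows (samples); this is what must equal the label length.
n_rows = X.shape[0]

# shape[-1] is the size of the last axis: columns for a 2-D array, rows for a 1-D array.
n_last_X = X.shape[-1]
n_last_y = y.shape[-1]

print(n_rows, n_last_X, n_last_y)  # 5 12 5
```

So 12 versus 1019169 from shape[-1] is expected and is not itself a mismatch.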

The data comes from the Kaggle New York City taxi dataset.
Code:

lgb_params = {
    'metric': 'rmse',
    'num_leaves': 31, 
    'objective': 'binary',
    'is_training_metric': True}

X_train = X_train[['passenger_count', 'pickup_longitude', 'pickup_latitude', 'dropoff_latitude', 'trip_duration', 'store_fwd_flag', 'direction', 'month', 'week', 'weekday', 'hour', 'minute_oftheday']]
X_test = X_train[['passenger_count', 'pickup_longitude', 'pickup_latitude', 'dropoff_latitude', 'trip_duration', 'store_fwd_flag', 'direction', 'month', 'week', 'weekday', 'hour', 'minute_oftheday']]

#lgb.Dataset.subset(used_columns: List[str]) -> lgb.Dataset
lgb_train = lgb.Dataset(X_train, y_train)
lgb_test = lgb.Dataset(X_test, y_test)
lgb_model = lgb.train(lgb_params, lgb_train, num_boost_round=10, 
                      valid_sets=[lgb_train, lgb_test], 
                      early_stopping_rounds=6)

The error trace is shown below:

[LightGBM] [Warning] Contains only one class
[LightGBM] [Info] Number of positive: 1019169, number of negative: 0
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.060011 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1606
[LightGBM] [Info] Number of data points in the train set: 1019169, number of used features: 12
[LightGBM] [Fatal] Length of label is not same with #data

Expected result: a trained LightGBM model and an MSE performance evaluation.
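The fatal message is raised when the label passed to a Dataset has a different length from that Dataset's row count, and the code above builds two Datasets, so the training pair and the test pair each need checking. Below is a minimal sketch of such a check. The helper name check_pair is hypothetical (it is not part of LightGBM), and the toy arrays only imitate one way a mismatch could arise, with test features built at the training row count and a shorter y_test.

```python
import numpy as np


def check_pair(name, X, y):
    """Return True if X and y have the same number of rows; print a note otherwise.

    Hypothetical helper for illustration, not part of LightGBM.
    """
    n_rows = np.shape(X)[0]
    n_labels = np.shape(y)[0]
    if n_rows != n_labels:
        print(f"{name}: {n_rows} rows but {n_labels} labels")
        return False
    return True


# Toy stand-ins: the training pair agrees, the test pair does not.
X_train, y_train = np.zeros((8, 12)), np.zeros(8)
X_test, y_test = np.zeros((8, 12)), np.zeros(2)

train_ok = check_pair("train", X_train, y_train)
test_ok = check_pair("test", X_test, y_test)
```

Running the same check on the real X_train/y_train and X_test/y_test before calling lgb.Dataset should show which pair is out of step.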

brccelvz  #1

Solved: y_train was incorrect. I updated y_train so that it has the correct shape after the train/test split.
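One way to keep the labels in step with the features is to split X and y with the same row indices, as in the sketch below. This is an illustration under stated assumptions, not the poster's actual fix: split_aligned and train_regressor are hypothetical names, the task is assumed to be regression (the question asks for an MSE evaluation, while 'objective': 'binary' is what produces the "Contains only one class" warning in the log), and lgb.early_stopping is assumed to be available (LightGBM 3.3 or later; the early_stopping_rounds argument to lgb.train was removed in 4.0). Note also that the question's code assigns X_test from X_train, so X_test ends up with the training row count; selecting the columns from the test frame instead is likely needed as well.

```python
import numpy as np


def split_aligned(X, y, test_fraction=0.2, seed=0):
    """Split X and y with the same row indices so each pair keeps equal length.

    Hypothetical helper; sklearn's train_test_split does the same job.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    if X.shape[0] != y.shape[0]:
        raise ValueError(f"X has {X.shape[0]} rows but y has {y.shape[0]} labels")
    rng = np.random.default_rng(seed)
    order = rng.permutation(X.shape[0])
    n_test = int(round(X.shape[0] * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]


def train_regressor(X_train, y_train, X_test, y_test):
    """Train with a regression objective (assumed task) and early stopping."""
    import lightgbm as lgb  # imported lazily so split_aligned works without LightGBM

    params = {"objective": "regression", "metric": "rmse", "num_leaves": 31}
    lgb_train = lgb.Dataset(X_train, label=y_train)
    # reference= makes the validation set reuse the training set's feature binning.
    lgb_test = lgb.Dataset(X_test, label=y_test, reference=lgb_train)
    return lgb.train(
        params,
        lgb_train,
        num_boost_round=10,
        valid_sets=[lgb_train, lgb_test],
        callbacks=[lgb.early_stopping(stopping_rounds=6)],
    )
```

Typical use would be to call split_aligned on the full feature matrix and label vector, then pass the four results straight to train_regressor, so that neither pair can drift apart between the split and the Dataset construction.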
