paddle 2.4.2 - GPU: training an LSTM fails with the error "Unsupported backend Undefined when casting it to paddle place type."

oxcyiej7  posted 5 months ago  in Other

Describe the Bug

Environment:
python 3.7
Debian 11
cuda 11.6
paddle 2.4.2 - GPU

The same error also occurs in other environments, namely:
with the python version changed to 3.8
with paddle 2.4.2 - GPU replaced by paddle 2.3.2 - GPU or paddle 2.2.2 - GPU
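When comparing several environments like this, it can help to print what the interpreter actually sees rather than rely on what was installed. The following is only a minimal sketch: the helper name `describe_paddle_env` is hypothetical, and it assumes the `paddle` package (if present) exposes `__version__`, `is_compiled_with_cuda()` and `paddle.device.get_device()`, as the Paddle 2.x releases do.

```python
import importlib
import platform


def describe_paddle_env():
    """Collect the interpreter and Paddle details relevant to this report.

    Hypothetical helper for illustration; it is not part of PaddleVideo.
    """
    info = {"python": platform.python_version(), "os": platform.platform()}
    try:
        paddle = importlib.import_module("paddle")
    except ImportError:
        # Paddle is not installed in this interpreter.
        info["paddle"] = None
        return info
    info["paddle"] = paddle.__version__
    # True only for a GPU build of Paddle.
    info["compiled_with_cuda"] = paddle.is_compiled_with_cuda()
    # Device string Paddle will use by default, e.g. "cpu" or "gpu:0".
    info["device"] = paddle.device.get_device()
    return info


if __name__ == "__main__":
    for key, value in describe_paddle_env().items():
        print(f"{key}: {value}")
```

Running this in each environment before launching training makes it easy to confirm which Paddle build and device are really in use.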

Project: PaddleVideo/FootballAction
Dataset: a personal dataset
Command: python -B -m paddle.distributed.launch --gpus="0" --log_dir=./football/logs_lstm main.py --validate -c applications/FootballAction/train_proposal/configs/lstm_football.yaml -o output_dir=./football/lstm

Error output:
Traceback (most recent call last):
File "main.py", line 142, in <module>
main()
File "main.py", line 130, in main
train_model(cfg,
out = _C_ops.full_batch_size_like(input, shape, dtype, value,
NotImplementedError: (Unimplemented) Unsupported backend Undefined when casting it to paddle place type. (at /paddle/paddle/phi/core/compat/convert_utils.cc:103)

LAUNCH INFO 2023-04-06 09:32:14,437 Exit code 1

The error was traced to the following code:

if in_dygraph_mode():
    if not isinstance(dtype, core.VarDesc.VarType):
        dtype = convert_np_dtype_to_dtype_(dtype)

    place = _current_expected_place()
    if force_cpu:
        place = core.CPUPlace()
    out = _C_ops.full_batch_size_like(input, shape, dtype, value,
                                      input_dim_idx, output_dim_idx, place)
    out.stop_gradient = True
    return out
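For readers unfamiliar with this op: as documented for `fill_constant_batch_size_like`, the output takes its shape from `shape`, except that dimension `output_dim_idx` is copied from dimension `input_dim_idx` of `input`. A minimal pure-Python sketch of just that shape rule follows; the helper name and the example dimensions are hypothetical and do not come from the FootballAction model.

```python
def batch_size_like_shape(input_shape, shape, input_dim_idx=0, output_dim_idx=0):
    """Return the output shape of a 'batch size like' fill.

    Hypothetical helper for illustration: copies one dimension of the
    input (usually the batch size) into the requested output shape.
    """
    out = list(shape)
    out[output_dim_idx] = input_shape[input_dim_idx]
    return out


# Example with made-up dimensions: a batch of 32 sequences.
print(batch_size_like_shape([32, 300, 2048], [-1, 2048]))  # [32, 2048]
```

In dygraph mode a shape computed this way could in principle be passed to `paddle.full` instead; whether that avoids the error reported here is untested.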

Additional Supplementary Information

LAUNCH INFO 2023-04-06 09:32:14,436 ------------------------- ERROR LOG DETAIL -------------------------
ell(2048, 2048)
(cell_bw): LSTMCell(2048, 2048)
)
)
(dropout): Dropout(p=0.5, axis=None, mode=upscale_in_train)
(att_fc0): Linear(in_features=4096, out_features=1, dtype=float32)
(softmax): Softmax(axis=-1)
(bi_lstm1): LSTM(1024, 1024
(0): BiRNN(
(cell_fw): LSTMCell(1024, 1024)
(cell_bw): LSTMCell(1024, 1024)
)
)
(att_fc1): Linear(in_features=2048, out_features=1, dtype=float32)
(fc1): Linear(in_features=6144, out_features=8192, dtype=float32)
(bn1): BatchNorm()
(dropout1): Dropout(p=0.5, axis=None, mode=upscale_in_train)
(fc2): Linear(in_features=8192, out_features=4096, dtype=float32)
(bn2): BatchNorm()
(dropout2): Dropout(p=0.5, axis=None, mode=upscale_in_train)
(fc3): Linear(in_features=4096, out_features=21, dtype=float32)
(fc4): Linear(in_features=4096, out_features=1, dtype=float32)
)

No response

zysjyyx4  #1

Does it run if you switch to an earlier paddle version, for example paddle=2.4.0?

vzgqcmou  #2

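This problem has been solved. The fix was to use PaddleVideo version 2.2; in the latest PaddleVideo version, training the LSTM (the third model in the Football application) may have a problem. Before that, I had switched paddle to versions 2.3 and 2.2 and still got the same error.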
