- Version and environment information:
- paddlepaddle-gpu 1.5.x
- Baidu AI Studio online environment (GPU version, V100)
The code comes from the OCR Attention model for scene text recognition:
https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/ocr_recognition
attention model
When running env CUDA_VISIBLE_DEVICES=0 python ./example/train.py --use_gpu=True --skip_test=True --save_model_dir="./models_attention" --model="attention"
the following error occurs:
----------- Configuration Arguments -----------
average_window: 0.15
batch_size: 16
eval_period: 15000
init_model: None
log_period: 1000
max_average_window: 12500
min_average_window: 10000
model: attention
parallel: False
profile: False
save_model_dir: ./models_attention
save_model_period: 15000
skip_batch_num: 0
skip_test: 1
test_images: work/dataset/ocr_chinese/train_images
test_list: train.list
total_step: 720000
train_images: work/dataset/ocr_chinese/train_images
train_list: train.list
use_gpu: 1
------------------------------------------------
/opt/conda/envs/python35-paddle120-env/lib/python3.5/site-packages/paddle/fluid/evaluator.py:71: Warning: The EditDistance is deprecated, because maintain a modified program inside evaluator cause bug easily, please use fluid.metrics.EditDistance instead.
% (self.__class__.__name__, self.__class__.__name__), Warning)
finish batch shuffle
W0729 15:06:09.451383 136 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0729 15:06:09.455503 136 device_context.cc:267] device: 0, cuDNN Version: 7.3.
Traceback (most recent call last):
File "./example/train.py", line 222, in <module>
main()
File "./example/train.py", line 218, in main
train(args)
File "./example/train.py", line 151, in train
results = train_one_batch(data)
File "./example/train.py", line 112, in train_one_batch
fetch_list=fetch_vars)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.5/site-packages/paddle/fluid/executor.py", line 650, in run
use_program_cache=use_program_cache)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.5/site-packages/paddle/fluid/executor.py", line 748, in _run
exe.run(program.desc, scope, 0, True, True, fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet: Invoke operator edit_distance error.
Python Callstacks:
File "/opt/conda/envs/python35-paddle120-env/lib/python3.5/site-packages/paddle/fluid/framework.py", line 1748, in append_op
attrs=kwargs.get("attrs", None))
File "/opt/conda/envs/python35-paddle120-env/lib/python3.5/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args,**kwargs)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.5/site-packages/paddle/fluid/layers/nn.py", line 5392, in edit_distance
attrs={"normalized": normalized})
File "/opt/conda/envs/python35-paddle120-env/lib/python3.5/site-packages/paddle/fluid/evaluator.py", line 261, in __init__
input=input, label=label, ignored_tokens=ignored_tokens)
File "/home/aistudio/example/attention_model.py", line 187, in attention_train_net
input=maxid, label=label_out, ignored_tokens=[sos, eos])
File "./example/train.py", line 61, in train
args, data_shape, num_classes)
File "./example/train.py", line 218, in main
train(args)
File "./example/train.py", line 222, in <module>
main()
C++ Callstacks:
Reference string 16 is empty. at [/paddle/paddle/fluid/operators/edit_distance_op.cu:92]
PaddlePaddle Call Stacks:
0 0x7ffb8569d2e0p void paddle::platform::EnforceNotMet::Init<char const*>(char const*, char const*, int) + 352
1 0x7ffb8569d659p paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int) + 137
2 0x7ffb85b28afap paddle::operators::EditDistanceGPUKernel<paddle::platform::CUDAPlace, float>::Compute(paddle::framework::ExecutionContext const&) const + 4938
3 0x7ffb85b28f23p std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::EditDistanceGPUKernel<paddle::platform::CUDAPlace, float> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) + 35
4 0x7ffb875f8657p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const + 375
5 0x7ffb875f8a31p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 529
6 0x7ffb875f602cp paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 332
7 0x7ffb8582747ep paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 382
8 0x7ffb8582a51fp paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool) + 143
9 0x7ffb8568e96dp
10 0x7ffb856cfca6p
11 0x7ffbea152199p PyCFunction_Call + 233
12 0x7ffbea1ed3f9p PyEval_EvalFrameEx + 33545
13 0x7ffbea1ef4b6p
14 0x7ffbea1ec5b5p PyEval_EvalFrameEx + 29893
15 0x7ffbea1ef4b6p
16 0x7ffbea1ec5b5p PyEval_EvalFrameEx + 29893
17 0x7ffbea1ef4b6p
18 0x7ffbea1ec5b5p PyEval_EvalFrameEx + 29893
19 0x7ffbea1ef4b6p
20 0x7ffbea1ec5b5p PyEval_EvalFrameEx + 29893
21 0x7ffbea1ed1d0p PyEval_EvalFrameEx + 32992
22 0x7ffbea1ef4b6p
23 0x7ffbea1ef5a8p PyEval_EvalCodeEx + 72
24 0x7ffbea1ef5ebp PyEval_EvalCode + 59
25 0x7ffbea21fa02p PyRun_FileExFlags + 178
26 0x7ffbea21fb67p PyRun_SimpleFileExFlags + 231
27 0x7ffbea23cd2cp Py_Main + 3676
28 0x400b54p main + 356
29 0x7ffbe91b0830p __libc_start_main + 240
30 0x400c01p
3 answers
gajydyqb1#
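Let me add some detail about the program.
I only changed
NUM_CLASSES = 95
to NUM_CLASSES = 3920, and
DATA_SHAPE = [1, 48, 512]
to DATA_SHAPE = [1, 48, 2560]
I also swapped the dataset for the official Chinese dataset; its image height is 48, the same as the default crnn_ctc dataset.
I converted the data in train.list to the numeric sequence format the original program expects (each character mapped to a number).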
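Since the C++ error reports an empty reference string, one thing worth checking is whether any label in train.list ends up empty once the ignored tokens are removed. The following is a minimal sketch, not part of the original program. It assumes each line looks like "width height image_name id,id,..." and that sos and eos are 0 and 1; the function name find_empty_references and the sample lines are hypothetical.

```python
def find_empty_references(lines, ignored_tokens=(0, 1)):
    """Return (index, line) pairs whose label is empty after dropping ignored tokens.

    Assumed line layout (hypothetical): "<width> <height> <image_name> <id,id,...>".
    ignored_tokens=(0, 1) assumes sos=0 and eos=1; adjust if your ids differ.
    """
    empty = []
    for index, raw in enumerate(lines):
        fields = raw.strip().split(" ")
        if fields == [""]:
            continue  # skip blank lines
        # A line with fewer than four fields has no label at all.
        label_field = fields[3] if len(fields) >= 4 else ""
        ids = [int(tok) for tok in label_field.split(",") if tok]
        remaining = [i for i in ids if i not in ignored_tokens]
        if not remaining:
            empty.append((index, raw.strip()))
    return empty


# Made-up sample lines for illustration only.
sample = [
    "2560 48 a.jpg 5,17,300",
    "2560 48 b.jpg 0,1",
    "2560 48 c.jpg",
    "2560 48 d.jpg 1,42",
]
for index, line in find_empty_references(sample):
    print("empty label at line", index, ":", line)
```

To check a real file, pass the lines of train.list instead of the sample list. If character ids were assigned starting from 0, any character mapped to 0 or 1 would be dropped under this assumption, so a short label could become empty.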
u5rb5r592#
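The log shows that the LoD of your label is not set correctly. You can add the following statement before this line:
Print op (fluid.layers.Print)
Could you post the log that is printed after adding the statement above?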
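When reading the printed LoD, it may help to convert it into per-sequence lengths. The sketch below assumes the LoD is reported as cumulative offsets (for example 0, 3, 3, 5 for three sequences); the helper names are hypothetical and the numbers are made up.

```python
def sequence_lengths(lod_offsets):
    """Convert offset-style LoD, e.g. [0, 3, 3, 5], into lengths [3, 0, 2]."""
    return [end - start for start, end in zip(lod_offsets, lod_offsets[1:])]


def empty_sequences(lod_offsets):
    """Return the 0-based positions of sequences whose length is zero."""
    return [i for i, n in enumerate(sequence_lengths(lod_offsets)) if n == 0]


# Made-up offsets: the second sequence starts and ends at 3, so it is empty.
offsets = [0, 3, 3, 5]
print(sequence_lengths(offsets))
print(empty_sequences(offsets))
```

Two equal neighbouring offsets mean that sample contributes no tokens, which is the kind of label an edit distance computation would reject as an empty reference.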
jpfvwuh43#