Paddle raises an error during optimization

2lpgd968 posted on 2022-10-20 in Other

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py:782: UserWarning: The following exception is not an EOF exception.
"The following exception is not an EOF exception.")
Traceback (most recent call last):
File "tools/train.py", line 323, in
main()
File "tools/train.py", line 233, in main
outs = exe.run(compiled_train_prog, fetch_list=train_values)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 783, in run
six.reraise(*sys.exc_info())
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/six.py", line 693, in reraise
raise value
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 778, in run
use_program_cache=use_program_cache)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 843, in _run_impl
return_numpy=return_numpy)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 677, in _run_parallel
tensors = exe.run(fetch_var_names)._move_to_list()
paddle.fluid.core_avx.EnforceNotMet:

C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::framework::OperatorWithKernel::IndicateVarDataType(paddle::framework::ExecutionContext const&, std::string const&) const
3 paddle::operators::ConcatOpGrad::GetExpectedKernelType(paddle::framework::ExecutionContext const&) const
4 paddle::framework::OperatorWithKernel::ChooseKernel(paddle::framework::RuntimeContext const&, paddle::framework::Scope const&, paddle::platform::Place const&) const
5 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
6 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
7 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
8 paddle::framework::details::ComputationOpHandle::RunImpl()
9 paddle::framework::details::FastThreadedSSAGraphExecutor::RunOpSync(paddle::framework::details::OpHandleBase*)
10 paddle::framework::details::FastThreadedSSAGraphExecutor::RunOp(paddle::framework::details::OpHandleBase*, std::shared_ptr<paddle::framework::BlockingQueue > const&, unsigned long*)
11 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result, std::__future_base::_Result_base::_Deleter>, void> >::_M_invoke(std::_Any_data const&)
12 std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
13 ThreadPool::ThreadPool(unsigned long)::{lambda() #1 }::operator()() const

Python Call Stacks (More useful to users):

File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2525, in append_op
attrs=kwargs.get("attrs", None))
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args,**kwargs)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/tensor.py", line 286, in concat
type='concat', inputs=inputs, outputs={'Out': [out]}, attrs=attrs)
File "/home/aistudio/data/PaddleDetection-release-0.2/ppdet/modeling/anchor_heads/yolact_head.py", line 1010, in lincomb_mask_loss
label_t = fluid.layers.concat(label_t_list, axis=0)
File "/home/aistudio/data/PaddleDetection-release-0.2/ppdet/modeling/anchor_heads/yolact_head.py", line 711, in get_loss
loss, maskiou_targets = self.lincomb_mask_loss(pos, idx_t, mask_data, proto_data, gt_mask, gt_box_t, labels, gt_num, batch_size, num_priors)
File "/home/aistudio/data/PaddleDetection-release-0.2/ppdet/modeling/architectures/yolactplus.py", line 88, in build
gt_box, gt_class, gt_segm, is_crowd, gt_num)
File "/home/aistudio/data/PaddleDetection-release-0.2/ppdet/modeling/architectures/yolactplus.py", line 98, in train
return self.build(feed_vars, 'train')
File "tools/train.py", line 116, in main
train_fetches = model.train(feed_vars)
File "tools/train.py", line 323, in
main()

Error Message Summary:

Error: The Input Variable(Out@GRAD) of concat_grad Op used to determine kernel data type is empty or not LoDTensor or SelectedRows.
[Hint: Expected data_type != dafault_data_type, but received data_type:-1 == dafault_data_type:-1.] at (/paddle/paddle/fluid/framework/operator.cc:1303)
[operator < concat_grad > error]
terminate called without an active exception
W0320 15:23:44.711206 13508 init.cc:209] Warning: PaddlePaddle catches a failure signal, it may not work properly
W0320 15:23:44.711257 13508 init.cc:211] You could check whether you killed PaddlePaddle thread/process accidentally or report the case to PaddlePaddle
W0320 15:23:44.711261 13508 init.cc:214] The detail failure signal is:

W0320 15:23:44.711277 13508 init.cc:217]Aborted at 1584689024 (unix time) try "date -d @1584689024" if you are using GNU date
W0320 15:23:44.713604 13508 init.cc:217] PC: @ 0x0 (unknown)
W0320 15:23:44.713701 13508 init.cc:217]***SIGABRT (@0x3e800003489) received by PID 13449 (TID 0x7f65d46b8700) from PID 13449; stack trace:***
W0320 15:23:44.715577 13508 init.cc:217] @ 0x7f66034bb390 (unknown)
W0320 15:23:44.717382 13508 init.cc:217] @ 0x7f6603115428 gsignal
W0320 15:23:44.719118 13508 init.cc:217] @ 0x7f660311702a abort
W0320 15:23:44.720266 13508 init.cc:217] @ 0x7f65c408084a __gnu_cxx::__verbose_terminate_handler()
W0320 15:23:44.721199 13508 init.cc:217] @ 0x7f65c407ef47 __cxxabiv1::__terminate()
W0320 15:23:44.722306 13508 init.cc:217] @ 0x7f65c407ef7d std::terminate()
W0320 15:23:44.723291 13508 init.cc:217] @ 0x7f65c407ec5a __gxx_personality_v0
W0320 15:23:44.724177 13508 init.cc:217] @ 0x7f65c4371b97 _Unwind_ForcedUnwind_Phase2
W0320 15:23:44.725060 13508 init.cc:217] @ 0x7f65c4371e7d _Unwind_ForcedUnwind
W0320 15:23:44.726830 13508 init.cc:217] @ 0x7f66034ba070 __GI___pthread_unwind
W0320 15:23:44.728565 13508 init.cc:217] @ 0x7f66034b2845 __pthread_exit
W0320 15:23:44.729019 13508 init.cc:217] @ 0x55e36f45ee59 PyThread_exit_thread
W0320 15:23:44.729168 13508 init.cc:217] @ 0x55e36f2e4c17 PyEval_RestoreThread.cold.798
W0320 15:23:44.730978 13508 init.cc:217] @ 0x7f659d1dc779 pybind11::gil_scoped_release::~gil_scoped_release()
W0320 15:23:44.731444 13508 init.cc:217] @ 0x7f659d189134 ZZN8pybind1112cpp_function10initializeIZN6paddle6pybindL22pybind11_init_core_avxERNS_6moduleEEUlRNS2_9operators6reader22LoDTensorBlockingQueueERKSt6vectorINS2_9framework9LoDTensorESaISC_EEE60_bIS9_SG_EINS_4nameENS_9is_methodENS_7siblingEEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNESY
W0320 15:23:44.733130 13508 init.cc:217] @ 0x7f659d1fab21 pybind11::cpp_function::dispatcher()
W0320 15:23:44.733628 13508 init.cc:217] @ 0x55e36f3e0744 _PyMethodDef_RawFastCallKeywords
W0320 15:23:44.734063 13508 init.cc:217] @ 0x55e36f3e0861 _PyCFunction_FastCallKeywords
W0320 15:23:44.734508 13508 init.cc:217] @ 0x55e36f44c6e8 _PyEval_EvalFrameDefault
W0320 15:23:44.734911 13508 init.cc:217] @ 0x55e36f39081a _PyEval_EvalCodeWithName
W0320 15:23:44.735327 13508 init.cc:217] @ 0x55e36f391635 _PyFunction_FastCallDict
W0320 15:23:44.735759 13508 init.cc:217] @ 0x55e36f449232 _PyEval_EvalFrameDefault
W0320 15:23:44.736145 13508 init.cc:217] @ 0x55e36f3dfccb _PyFunction_FastCallKeywords
W0320 15:23:44.736583 13508 init.cc:217] @ 0x55e36f447a93 _PyEval_EvalFrameDefault
W0320 15:23:44.736985 13508 init.cc:217] @ 0x55e36f3dfccb _PyFunction_FastCallKeywords
W0320 15:23:44.737432 13508 init.cc:217] @ 0x55e36f447a93 _PyEval_EvalFrameDefault
W0320 15:23:44.737835 13508 init.cc:217] @ 0x55e36f39156b _PyFunction_FastCallDict
W0320 15:23:44.738241 13508 init.cc:217] @ 0x55e36f3afe53 _PyObject_Call_Prepend
W0320 15:23:44.738678 13508 init.cc:217] @ 0x55e36f3a2dbe PyObject_Call
W0320 15:23:44.738860 13508 init.cc:217] @ 0x55e36f49f817 t_bootstrap
W0320 15:23:44.738970 13508 init.cc:217] @ 0x55e36f45a788 pythread_wrapper
W0320 15:23:44.740844 13508 init.cc:217] @ 0x7f66034b16ba start_thread
Aborted (core dumped)

cczfrluj #1

Training runs fine without optimization; the error appears only when optimization is enabled.

mrfwxfqh #2

Could you post the code, or a link to it? My guess is that this is caused by how stop_gradient is used somewhere.
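
Until the code is posted, the "Python Call Stacks" section of the log can already narrow down where to look. The following is only a rough sketch, assuming the log has been saved as plain text: `user_frames` and `SAMPLE_LOG` are hypothetical names introduced here, not part of Paddle, and the "site-packages" filter is just a heuristic for skipping library frames.

```python
import re

# Matches frame lines such as:
#   File "/path/to/file.py", line 1010, in lincomb_mask_loss
# The function name is optional because some frames in the log end at "in".
FRAME_RE = re.compile(
    r'File "(?P<path>[^"]+)", line (?P<line>\d+), in ?(?P<func>\S*)'
)


def user_frames(log_text, library_marker="site-packages"):
    """Return (path, line, func) for frames that are not inside installed libraries."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        path = match.group("path")
        if library_marker in path:
            continue  # skip frames that live inside Paddle, six, etc.
        frames.append((path, int(match.group("line")), match.group("func")))
    return frames


# Excerpt copied from the "Python Call Stacks" section of the log above.
SAMPLE_LOG = '''
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/tensor.py", line 286, in concat
type='concat', inputs=inputs, outputs={'Out': [out]}, attrs=attrs)
File "/home/aistudio/data/PaddleDetection-release-0.2/ppdet/modeling/anchor_heads/yolact_head.py", line 1010, in lincomb_mask_loss
label_t = fluid.layers.concat(label_t_list, axis=0)
File "/home/aistudio/data/PaddleDetection-release-0.2/ppdet/modeling/anchor_heads/yolact_head.py", line 711, in get_loss
'''

for path, line, func in user_frames(SAMPLE_LOG):
    print(f"{path}:{line} in {func}")
```

In this log the first frame left after filtering is the fluid.layers.concat call in lincomb_mask_loss (yolact_head.py, line 1010), so the stop_gradient settings of the variables feeding into and consuming label_t are probably the first thing worth checking.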

6qqygrtg #3

I revised the report and posted the full details in #23140.
