Running the official PointNet++ training model fails with "Warning: PaddlePaddle catches a failure signal, it may not work properly"

5anewei6  posted on 2022-10-20  in Perl

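1) PaddlePaddle version: 1.7.1
2) CUDA: 9.0
3) cuDNN: 7.3
4) System environment: Linux 16.4, Python 2.7
Installation details:
1) Installed via pip
2) models: release/1.7
Problem: While running the official PointNet++ training model, I followed the official steps up to running sh scripts/train_cls.sh. It printed the error output below and then the program stopped. I don't understand what the error means or how to fix it.
Error output: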

(tensorflow) xz@xz:~/models-release-1.7/PaddleCV/3d_vision/PointNet++$ sh scripts/train_cls.sh
2020-04-03 15:10:47,093-INFO: ----------- Configuration Arguments -----------
2020-04-03 15:10:47,093-INFO: batch_size: 4
2020-04-03 15:10:47,093-INFO: bn_momentum: 0.99
2020-04-03 15:10:47,094-INFO: data_dir: dataset/ModelNet40/modelnet40_ply_hdf5_2048
2020-04-03 15:10:47,094-INFO: decay_steps: 12500
2020-04-03 15:10:47,094-INFO: enable_ce: False
2020-04-03 15:10:47,094-INFO: epoch: 10
2020-04-03 15:10:47,094-INFO: log_interval: 1
2020-04-03 15:10:47,094-INFO: lr: 0.01
2020-04-03 15:10:47,094-INFO: lr_decay: 0.7
2020-04-03 15:10:47,094-INFO: model: MSG
2020-04-03 15:10:47,094-INFO: num_classes: 40
2020-04-03 15:10:47,094-INFO: num_points: 4096
2020-04-03 15:10:47,094-INFO: resume: None
2020-04-03 15:10:47,094-INFO: save_dir: checkpoints_cls
2020-04-03 15:10:47,094-INFO: use_gpu: True
2020-04-03 15:10:47,094-INFO: weight_decay: 1e-05
2020-04-03 15:10:47,094-INFO: ------------------------------------------------
W0403 15:10:47.322461 3862 init.cc:209] Warning: PaddlePaddle catches a failure signal, it may not work properly
W0403 15:10:47.322497 3862 init.cc:211] You could check whether you killed PaddlePaddle thread/process accidentally or report the case to PaddlePaddle
W0403 15:10:47.322506 3862 init.cc:214] The detail failure signal is:

W0403 15:10:47.322515 3862 init.cc:217]***Aborted at 1585897847 (unix time) try "date -d @1585897847" if you are using GNU date***
W0403 15:10:47.324349 3862 init.cc:217] PC: @ 0x0 (unknown)
W0403 15:10:47.324578 3862 init.cc:217]***SIGILL (@0x7eff833e1b35) received by PID 3862 (TID 0x7effc1f9d740) from PID 18446744071616469813; stack trace:***
W0403 15:10:47.326310 3862 init.cc:217] @ 0x7effc1d52330 (unknown)
W0403 15:10:47.327126 3862 init.cc:217] @ 0x7eff833e1b35 paddle::framework::OpDesc::SetAttrMap()
W0403 15:10:47.327630 3862 init.cc:217] @ 0x7eff841e8a04 paddle::operators::GroupPointsGradDescMaker<>::Apply()
W0403 15:10:47.328037 3862 init.cc:217] @ 0x7eff841c57e3 paddle::framework::SingleGradOpMaker<>::operator()()
W0403 15:10:47.328611 3862 init.cc:217] @ 0x7eff841e6313 ZZNK6paddle9framework7details12OpInfoFillerINS_9operators24GroupPointsGradDescMakerINS0_6OpDescEEELNS1_14OpInfoFillTypeE2EEclEPKcPNS0_6OpInfoEENKUlRKS5_RKSt13unordered_setISsSt4hashISsESt8equal_toISsESaISsEEPSt13unordered_mapISsSsSH_SJ_SaISt4pairIKSsSsEEERKSt6vectorIPNS0_9BlockDescESaISX_EEE_clESE_SN_SU_S11
W0403 15:10:47.329150 3862 init.cc:217] @ 0x7eff841e7b2e ZNSt17_Function_handlerIFSt6vectorISt10unique_ptrIN6paddle9framework6OpDescESt14default_deleteIS4_EESaIS7_EERKS4_RKSt13unordered_setISsSt4hashISsESt8equal_toISsESaISsEEPSt13unordered_mapISsSsSE_SG_SaISt4pairIKSsSsEEERKS0_IPNS3_9BlockDescESaIST_EEEZNKS3_7details12OpInfoFillerINS2_9operators24GroupPointsGradDescMakerIS4_EELNSZ_14OpInfoFillTypeE2EEclEPKcPNS3_6OpInfoEEUlSB_SK_SR_SX_E_E9_M_invokeERKSt9_Any_dataSB_SK_SR_SX
W0403 15:10:47.329793 3862 init.cc:217] @ 0x7eff833a85a0 PD_GetGradOpDescStrs
W0403 15:10:47.330160 3862 init.cc:217] @ 0x7eff88a20e0f ZNSt17_Function_handlerIFSt6vectorISt10unique_ptrIN6paddle9framework6OpDescESt14default_deleteIS4_EESaIS7_EERKS4_RKSt13unordered_setISsSt4hashISsESt8equal_toISsESaISsEEPSt13unordered_mapISsSsSE_SG_SaISt4pairIKSsSsEEERKS0_IPNS3_9BlockDescESaIST_EEEZNS3_9LoadOpLibERSN_EUlSB_SK_SR_SX_E_E9_M_invokeERKSt9_Any_dataSB_SK_SR_SX
W0403 15:10:47.330523 3862 init.cc:217] @ 0x7eff88a208bc ZZN8pybind1112cpp_function10initializeIZN6paddle6pybindL24pybind11_init_core_noavxERNS_6moduleEEUlRKNS2_9framework6OpDescERKSt13unordered_setISsSt4hashISsESt8equal_toISsESaISsEERKSt6vectorIPNS6_9BlockDescESaISL_EEE68_St4pairISJ_IPS7_SaISS_EESt13unordered_mapISsSsSC_SE_SaISR_IKSsSsEEEEIS9_SI_SP_EINS_4nameENS_5scopeENS_7siblingEEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNES1H
W0403 15:10:47.332165 3862 init.cc:217] @ 0x7eff88a67d8e pybind11::cpp_function::dispatcher()
W0403 15:10:47.334061 3862 init.cc:217] @ 0x7effc20862f4 PyEval_EvalFrameEx
W0403 15:10:47.336364 3862 init.cc:217] @ 0x7effc2087b19 PyEval_EvalCodeEx
W0403 15:10:47.338436 3862 init.cc:217] @ 0x7effc2084fe8 PyEval_EvalFrameEx
W0403 15:10:47.340397 3862 init.cc:217] @ 0x7effc2087b19 PyEval_EvalCodeEx
W0403 15:10:47.342281 3862 init.cc:217] @ 0x7effc2084fe8 PyEval_EvalFrameEx
W0403 15:10:47.344147 3862 init.cc:217] @ 0x7effc2087b19 PyEval_EvalCodeEx
W0403 15:10:47.346045 3862 init.cc:217] @ 0x7effc2084fe8 PyEval_EvalFrameEx
W0403 15:10:47.347934 3862 init.cc:217] @ 0x7effc2087b19 PyEval_EvalCodeEx
W0403 15:10:47.349687 3862 init.cc:217] @ 0x7effc20107d7 function_call
W0403 15:10:47.351596 3862 init.cc:217] @ 0x7effc1febb83 PyObject_Call
W0403 15:10:47.353756 3862 init.cc:217] @ 0x7effc2080aee PyEval_EvalFrameEx
W0403 15:10:47.355752 3862 init.cc:217] @ 0x7effc2087b19 PyEval_EvalCodeEx
W0403 15:10:47.357581 3862 init.cc:217] @ 0x7effc20107d7 function_call
W0403 15:10:47.359529 3862 init.cc:217] @ 0x7effc1febb83 PyObject_Call
W0403 15:10:47.361403 3862 init.cc:217] @ 0x7effc2080aee PyEval_EvalFrameEx
W0403 15:10:47.363319 3862 init.cc:217] @ 0x7effc2087b19 PyEval_EvalCodeEx
W0403 15:10:47.365209 3862 init.cc:217] @ 0x7effc2084fe8 PyEval_EvalFrameEx
W0403 15:10:47.367069 3862 init.cc:217] @ 0x7effc2087b19 PyEval_EvalCodeEx
W0403 15:10:47.368911 3862 init.cc:217] @ 0x7effc2084fe8 PyEval_EvalFrameEx
W0403 15:10:47.370736 3862 init.cc:217] @ 0x7effc20865ce PyEval_EvalFrameEx
W0403 15:10:47.372582 3862 init.cc:217] @ 0x7effc2087b19 PyEval_EvalCodeEx
W0403 15:10:47.374399 3862 init.cc:217] @ 0x7effc2087d3a PyEval_EvalCode
Illegal instruction (core dumped)
w46czmvw 1#

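What is your local gcc version, and was Paddle 1.7.1 installed via pip? The pip package is compiled with gcc 4.8.2, so the ext_op is best compiled with gcc 4.8 as well; otherwise there may be compatibility issues.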
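To answer the gcc question quickly, something like the following could be run in the same environment that builds the ext_op. It is only a sketch: the helper names (`parse_gcc_version`, `local_gcc_version`, `matches_pip_build`) are made up for this example, and the "4.8" target is simply the version quoted in this answer, not something the script looks up from Paddle.

```python
import re
import subprocess


def parse_gcc_version(version_output):
    """Return (major, minor, patch) from `gcc --version` text, or None."""
    # The first x.y.z triple in the output is taken as the compiler version.
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def local_gcc_version():
    """Ask the gcc found on PATH for its version; None if gcc is unavailable."""
    try:
        out = subprocess.check_output(["gcc", "--version"])
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_gcc_version(out.decode("utf-8", "replace"))


def matches_pip_build(version, expected=(4, 8)):
    """True when major.minor equals the gcc series quoted for the pip wheel."""
    return version is not None and tuple(version[:2]) == tuple(expected)


if __name__ == "__main__":
    version = local_gcc_version()
    print("local gcc: %s" % (version,))
    print("same major.minor as gcc 4.8: %s" % matches_pip_build(version))
```

A match here only rules out the compiler mismatch mentioned above; it does not by itself explain the SIGILL.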

4si2a6ki 2#

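What is your local gcc version, and was Paddle 1.7.1 installed via pip? The pip package is compiled with gcc 4.8.2, so the ext_op is best compiled with gcc 4.8 as well; otherwise there may be compatibility issues.

gcc is 4.8.4, and Paddle 1.7.1 is the release version installed via pip. After compiling pointnet_lib.so, I ran the test files following the steps in the README without any problems. The error only appears when running network training.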
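One more thing that may be worth checking, given that the signal is SIGILL and the stack trace goes through `pybind11_init_core_noavx`: whether the CPU reports AVX support. This is an unconfirmed guess about the cause, not a diagnosis from the thread. The sketch below assumes an x86 Linux machine where `/proc/cpuinfo` has a `flags` line; the function names are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def read_cpuinfo(path="/proc/cpuinfo"):
    """Return the file contents, or an empty string if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read()
    except IOError:
        return ""


if __name__ == "__main__":
    flags = cpu_flags(read_cpuinfo())
    for name in ("avx", "avx2", "sse4_2"):
        print("%s: %s" % (name, name in flags))
```

If `avx` is reported as missing, a custom op compiled with instructions the CPU does not support would be one possible source of an illegal-instruction crash, but that would still need to be confirmed.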

kxe2p93d 3#

Did you modify the dataset or the network structure in any way?

dsekswqp 4#

Did you modify the dataset or the network structure in any way?

I set the number of epochs to 10; nothing else was changed.
