Paddle: Could float16 support be added for shape and deformable_conv?

Posted by 2w3kk1z5 on 2022-11-03 in Other

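In PaddleDetection, mixed_precision training fails when the network reaches fluid.layers.shape or fluid.layers.deformable_conv, with an error saying float16 is not supported. Could float16 support be added for these ops?

  • Version and environment information

1) PaddlePaddle version: 1.8.0; PaddleDetection version: 0.3
2) CPU/GPU: CUDA 10, cuDNN 7.6
3) OS: Ubuntu 16.04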

0mkxixxg  #1

OK, we have received your feedback.

lkaoscv7  #2

Please describe the problem in detail.

r8uurelv  #3

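My network contains the following code:
data_shape = fluid.layers.shape(x)
.....
channel_add_term = fluid.layers.resize_nearest(channel_add_term, data_shape[2:], align_corners=True)
Training without mixed precision works fine.
With mixed precision training enabled, it fails with the following log: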

2020-07-21 15:07:14,674-INFO: loading roidb 2012_train
100%|████████████████████████████████████████████████████████████████████████████████████████| 1366/1366 [00:00<00:00, 4269.22it/s]
2020-07-21 15:07:15,031-INFO: finish loading roidb from scope 2012_train
2020-07-21 15:07:21,275-INFO: finish loading roidbs, total num = 7572
2020-07-21 15:07:21,395-INFO: set max batches to 3028
2020-07-21 15:07:21,395-INFO: Total iters are 3028 under 1 devices
2020-07-21 15:07:21,395-INFO: set (<ppdet.optimizer.PiecewiseDecay object at 0x7f03d82b8e50>).milestones to [11102, 19682]
2020-07-21 15:07:21,395-INFO: set base learning_rate to 0.000125
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'dtype' in cast only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'x' in cast only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'input' in conv2d only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'input' in batch_norm only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'x' in exp only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'x' in elementwise_mul only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'y' in elementwise_mul only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'x' in elementwise_div only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'y' in elementwise_div only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'dtype' in create_parameter only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'x' in elementwise_add only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'y' in elementwise_add only support float16 in GPU now.
(input_name, op_name, extra_message))
2020-07-21 15:07:23,541-INFO: If regularizer of a Parameter has been set by 'fluid.ParamAttr' or 'fluid.WeightNormParamAttr' already. The Regularization[L2Decay, regularization_coeff=0.000500] in Optimizer will not take effect, and it will only be applied to other Parameters!
W0721 15:07:24.042296 6562 device_context.cc:252] Please NOTE: device: 0, CUDA Capability: 75, Driver API Version: 10.1, Runtime API Version: 10.0
W0721 15:07:24.044699 6562 device_context.cc:260] device: 0, cuDNN Version: 7.6.
2020-07-21 15:07:25,867-WARNING: pretrained_models/yolov3_r50vd_dcn_obj365_dropblock_iouloss.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
2020-07-21 15:07:28,234-WARNING: variable yolo_output.2.conv.bias not used
2020-07-21 15:07:28,235-WARNING: variable yolo_output.1.conv.bias not used
2020-07-21 15:07:28,235-WARNING: variable yolo_output.1.conv.weights not used
2020-07-21 15:07:28,235-WARNING: variable yolo_output.0.conv.weights not used
2020-07-21 15:07:28,235-WARNING: variable yolo_output.2.conv.weights not used
2020-07-21 15:07:28,235-WARNING: variable yolo_output.0.conv.bias not used
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/io.py:2000: UserWarning: This list is not set, Because of Paramerter not found in program. There are: create_parameter_22.w_0 create_parameter_6.w_0 create_parameter_17.w_0 create_parameter_4.w_0 create_parameter_11.w_0 create_parameter_9.w_0 create_parameter_3.w_0 create_parameter_10.w_0 create_parameter_16.w_0 create_parameter_21.w_0 create_parameter_0.w_0 create_parameter_13.w_0 create_parameter_23.w_0 create_parameter_7.w_0 create_parameter_20.w_0 create_parameter_15.w_0 create_parameter_14.w_0 create_parameter_5.w_0 create_parameter_2.w_0 create_parameter_1.w_0 create_parameter_12.w_0 create_parameter_18.w_0 create_parameter_19.w_0 create_parameter_8.w_0
format(" ".join(unused_para_list)))
2020-07-21 15:07:28,361-INFO: places would be ommited when DataLoader is not iterable
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py:1070: UserWarning: The following exception is not an EOF exception.
"The following exception is not an EOF exception.")
Traceback (most recent call last):
File "tools/train.py", line 424, in <module>
main()
File "tools/train.py", line 295, in main
outs = exe.run(compiled_train_prog, fetch_list=train_values)
File "/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1071, in run
six.reraise(*sys.exc_info())
File "/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/six.py", line 703, in reraise
raise value
File "/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1066, in run
return_merged=return_merged)
File "/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1167, in _run_impl
return_merged=return_merged)
File "/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 879, in _run_parallel
tensors = exe.run(fetch_var_names, return_merged)._move_to_list()
paddle.fluid.core_avx.EnforceNotMet:

C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<std::string>(std::string&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(paddle::platform::ErrorSummary const&, char const*, int)
2 paddle::framework::OperatorWithKernel::ChooseKernel(paddle::framework::RuntimeContext const&, paddle::framework::Scope const&, paddle::platform::Place const&) const
3 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
4 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
5 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
6 paddle::framework::details::ComputationOpHandle::RunImpl()
7 paddle::framework::details::FastThreadedSSAGraphExecutor::RunOpSync(paddle::framework::details::OpHandleBase*)
8 paddle::framework::details::FastThreadedSSAGraphExecutor::RunOp(paddle::framework::details::OpHandleBase*, std::shared_ptr<paddle::framework::BlockingQueue<unsigned long> > const&, unsigned long*)
9 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, void> >::_M_invoke(std::_Any_data const&)
10 std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
11 ThreadPool::ThreadPool(unsigned long)::{lambda() #1 }::operator()() const

Python Call Stacks (More useful to users):

File "/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2610, in append_op
attrs=kwargs.get("attrs", None))
File "/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/layers/nn.py", line 12053, in shape
type='shape', inputs={'Input': input}, outputs={'Out': out})
File "/home/bwang/projects/sniper-paddle/ppdet/modeling/backbones/gc_block.py", line 196, in add_gc_block
data_shape = fluid.layers.shape(x)
File "/home/bwang/projects/sniper-paddle/ppdet/modeling/backbones/resnet.py", line 355, in bottleneck
residual = add_gc_block(residual, name=gcb_name, **self.gcb_params)
File "/home/bwang/projects/sniper-paddle/ppdet/modeling/backbones/resnet.py", line 430, in layer_warp
gcb_name=gcb_name)
File "/home/bwang/projects/sniper-paddle/ppdet/modeling/backbones/resnet.py", line 498, in __call__
res = self.layer_warp(res, i)
File "/home/bwang/projects/sniper-paddle/ppdet/modeling/architectures/yolo.py", line 63, in build
body_feats = self.backbone(im)
File "/home/bwang/projects/sniper-paddle/ppdet/modeling/architectures/yolo.py", line 222, in train
return self.build(feed_vars, mode='train')
File "tools/train.py", line 166, in main
train_fetches = model.train(feed_vars)
File "tools/train.py", line 424, in <module>
main()

Error Message Summary:

Error: op shape does not have kernel for data_type[::paddle::platform::float16]:data_layout[ANY_LAYOUT]:place[CUDAPlace(0)]:library_type[PLAIN] at (/paddle/paddle/fluid/framework/operator.cc:1081)
[operator < shape > error]
terminate called without an active exception
W0721 15:07:34.516563 6621 init.cc:216] Warning: PaddlePaddle catches a failure signal, it may not work properly
W0721 15:07:34.516583 6621 init.cc:218] You could check whether you killed PaddlePaddle thread/process accidentally or report the case to PaddlePaddle
W0721 15:07:34.516592 6621 init.cc:221] The detail failure signal is:

W0721 15:07:34.516600 6621 init.cc:224]Aborted at 1595315254 (unix time) try "date -d @1595315254" if you are using GNU date
W0721 15:07:34.518784 6621 init.cc:224] PC: @ 0x0 (unknown)
W0721 15:07:34.518852 6621 init.cc:224]***SIGABRT (@0x3e8000019a2) received by PID 6562 (TID 0x7f02f27fc700) from PID 6562; stack trace:***
W0721 15:07:34.520735 6621 init.cc:224] @ 0x7f045849a390 (unknown)
W0721 15:07:34.522588 6621 init.cc:224] @ 0x7f04580f4428 gsignal
W0721 15:07:34.524437 6621 init.cc:224] @ 0x7f04580f602a abort
W0721 15:07:34.525629 6621 init.cc:224] @ 0x7f0423d7c84a __gnu_cxx::__verbose_terminate_handler()
W0721 15:07:34.526651 6621 init.cc:224] @ 0x7f0423d7af47 __cxxabiv1::__terminate()
W0721 15:07:34.527976 6621 init.cc:224] @ 0x7f0423d7af7d std::terminate()
W0721 15:07:34.528956 6621 init.cc:224] @ 0x7f0423d7ac5a __gxx_personality_v0
W0721 15:07:34.530412 6621 init.cc:224] @ 0x7f0456f1fb97 _Unwind_ForcedUnwind_Phase2
W0721 15:07:34.531599 6621 init.cc:224] @ 0x7f0456f1fe7d _Unwind_ForcedUnwind
W0721 15:07:34.532696 6621 init.cc:224] @ 0x7f0458499070 __GI___pthread_unwind
W0721 15:07:34.533778 6621 init.cc:224] @ 0x7f0458491845 __pthread_exit
W0721 15:07:34.533972 6621 init.cc:224] @ 0x5601e546e059 PyThread_exit_thread
W0721 15:07:34.534036 6621 init.cc:224] @ 0x5601e52f3c10 PyEval_RestoreThread.cold.799
W0721 15:07:34.534757 6621 init.cc:224] @ 0x7f0445e05269 (unknown)
W0721 15:07:34.534971 6621 init.cc:224] @ 0x5601e53f4ab4 _PyMethodDef_RawFastCallKeywords
W0721 15:07:34.535179 6621 init.cc:224] @ 0x5601e53f4bd1 _PyCFunction_FastCallKeywords
W0721 15:07:34.535387 6621 init.cc:224] @ 0x5601e545b57b _PyEval_EvalFrameDefault
W0721 15:07:34.535576 6621 init.cc:224] @ 0x5601e53a0389 _PyEval_EvalCodeWithName
W0721 15:07:34.535753 6621 init.cc:224] @ 0x5601e53f4255 _PyFunction_FastCallKeywords
W0721 15:07:34.535962 6621 init.cc:224] @ 0x5601e5456d40 _PyEval_EvalFrameDefault
W0721 15:07:34.536151 6621 init.cc:224] @ 0x5601e53a0389 _PyEval_EvalCodeWithName
W0721 15:07:34.536360 6621 init.cc:224] @ 0x5601e53a14c5 _PyFunction_FastCallDict
W0721 15:07:34.536548 6621 init.cc:224] @ 0x5601e53c0a73 _PyObject_Call_Prepend
W0721 15:07:34.536662 6621 init.cc:224] @ 0x5601e540827a slot_tp_call
W0721 15:07:34.536852 6621 init.cc:224] @ 0x5601e54092db _PyObject_FastCallKeywords
W0721 15:07:34.537057 6621 init.cc:224] @ 0x5601e545b146 _PyEval_EvalFrameDefault
W0721 15:07:34.537248 6621 init.cc:224] @ 0x5601e53a13fb _PyFunction_FastCallDict
W0721 15:07:34.537434 6621 init.cc:224] @ 0x5601e53c0a73 _PyObject_Call_Prepend
W0721 15:07:34.537540 6621 init.cc:224] @ 0x5601e540827a slot_tp_call
W0721 15:07:34.537730 6621 init.cc:224] @ 0x5601e54092db _PyObject_FastCallKeywords
W0721 15:07:34.537935 6621 init.cc:224] @ 0x5601e545ba39 _PyEval_EvalFrameDefault
W0721 15:07:34.538122 6621 init.cc:224] @ 0x5601e53a0389 _PyEval_EvalCodeWithName
Aborted (core dumped)
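
The error summary above comes from the executor's kernel selection step (ChooseKernel in the C++ stack): at run time, no shape kernel is registered for float16 on the GPU. The plain-Python sketch below models that lookup and one possible way around it, which is to query the shape on a float32 copy, since the shape does not depend on the element type. It is a toy model, not Paddle's code: KERNELS, Tensor, choose_kernel, cast and shape are hypothetical names, and the dtype sets are assumptions. Whether the corresponding fluid.layers.cast call survives the mixed-precision program rewrite in 1.8.0 is not confirmed here.

```python
# Toy model of per-dtype kernel lookup; NOT Paddle's real implementation.
# All names below are hypothetical and exist only for illustration.

KERNELS = {
    # op name -> dtypes that have a registered kernel (assumed values)
    "shape": {"float32", "float64", "int32", "int64"},
    "cast": {"float16", "float32", "float64", "int32", "int64"},
}


class Tensor:
    """Minimal stand-in for a tensor: only dims and dtype matter here."""

    def __init__(self, dims, dtype):
        self.dims = tuple(dims)
        self.dtype = dtype


def choose_kernel(op, dtype):
    """Raise at run time if no kernel is registered for (op, dtype)."""
    if dtype not in KERNELS[op]:
        raise RuntimeError(
            "op %s does not have kernel for data_type[%s]" % (op, dtype))
    return op


def cast(x, dtype):
    """Return a tensor with the same dims and a new dtype."""
    choose_kernel("cast", x.dtype)
    return Tensor(x.dims, dtype)


def shape(x):
    """Return the dims as a list, if a kernel exists for x.dtype."""
    choose_kernel("shape", x.dtype)
    return list(x.dims)


x_fp16 = Tensor((8, 256, 32, 32), "float16")

try:
    shape(x_fp16)  # no float16 kernel in this toy registry
except RuntimeError as err:
    print("fails:", err)

# Cast first, then ask for the shape: the dims are unchanged by the cast.
data_shape = shape(cast(x_fp16, "float32"))
print(data_shape[2:])  # [32, 32]
```

In the sketch, only the shape query runs in float32; the rest of the network could stay in float16.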

0g0grzrc  #4

When I use mixed precision training on a network that contains deformable_conv layers, I get the following error:

2020-07-21 15:09:35,355-INFO: set base learning_rate to 0.000125
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'dtype' in cast only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'x' in cast only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'input' in conv2d only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'input' in batch_norm only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'x' in exp only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'x' in elementwise_mul only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'y' in elementwise_mul only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'x' in elementwise_div only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'y' in elementwise_div only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'dtype' in create_parameter only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'x' in elementwise_add only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'y' in elementwise_add only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'x' in sigmoid only support float16 in GPU now.
(input_name, op_name, extra_message))
/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py:110: UserWarning: The data type of 'input' in deformable_conv only support float16 in GPU now.
(input_name, op_name, extra_message))
Traceback (most recent call last):
File "tools/train.py", line 424, in <module>
main()
File "tools/train.py", line 166, in main
train_fetches = model.train(feed_vars)
File "/home/bwang/projects/sniper-paddle/ppdet/modeling/architectures/yolo.py", line 222, in train
return self.build(feed_vars, mode='train')
File "/home/bwang/projects/sniper-paddle/ppdet/modeling/architectures/yolo.py", line 63, in build
body_feats = self.backbone(im)
File "/home/bwang/projects/sniper-paddle/ppdet/modeling/backbones/resnet.py", line 498, in __call__
res = self.layer_warp(res, i)
File "/home/bwang/projects/sniper-paddle/ppdet/modeling/backbones/resnet.py", line 430, in layer_warp
gcb_name=gcb_name)
File "/home/bwang/projects/sniper-paddle/ppdet/modeling/backbones/resnet.py", line 343, in bottleneck
gcb_name=gcb_name)
File "/home/bwang/projects/sniper-paddle/ppdet/modeling/backbones/resnet.py", line 218, in _conv_norm
name=_name + ".conv2d.output.1")
File "/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/layers/nn.py", line 15066, in deformable_conv
'deformable_conv')
File "/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py", line 80, in check_variable_and_dtype
check_dtype(input.dtype, input_name, expected_dtype, op_name, extra_message)
File "/home/bwang/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/data_feeder.py", line 115, in check_dtype
extra_message))
TypeError: The data type of 'input' in deformable_conv must be ['float32', 'float64'], but received float16.
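
Unlike a missing-kernel error raised by the executor, this TypeError is raised while the network is still being built: the traceback ends in check_dtype, a Python-side check called from the deformable_conv layer function. The plain-Python sketch below imitates such a build-time check and shows the cast-up, cast-back pattern that would keep one op in float32. It is a toy model under stated assumptions: check_dtype here is a simplified stand-in, and deformable_conv_like and run_in_fp32 are hypothetical helpers, not Paddle APIs.

```python
# Toy model of a build-time dtype check; NOT Paddle's real implementation.
# deformable_conv_like and run_in_fp32 are hypothetical names.


def check_dtype(dtype, input_name, expected, op_name):
    """Reject unsupported dtypes before any kernel is ever launched."""
    if dtype not in expected:
        raise TypeError(
            "The data type of '%s' in %s must be %s, but received %s."
            % (input_name, op_name, expected, dtype))


def deformable_conv_like(x_dtype):
    """Stand-in for an op whose Python wrapper accepts only fp32/fp64."""
    check_dtype(x_dtype, "input", ["float32", "float64"], "deformable_conv")
    return x_dtype  # the output dtype follows the input dtype


def run_in_fp32(op, x_dtype):
    """Cast float16 input up to float32, run the op, cast the result back."""
    if x_dtype != "float16":
        return op(x_dtype)
    op("float32")      # the op itself only ever sees float32
    return "float16"   # the caller gets its original dtype back


try:
    deformable_conv_like("float16")
except TypeError as err:
    message = str(err)
    print(message)

result_dtype = run_in_fp32(deformable_conv_like, "float16")
print(result_dtype)  # float16
```

The trade-off of this pattern is two extra casts around the op, and the op itself gains nothing from float16.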
