Paddle: paddle-trt inference fails when loading a faster-rcnn model

5cg8jx4n  posted on 2022-10-20 in: Other
  • Title: paddle-trt inference fails when loading a faster-rcnn model
  • Version / environment info:

   1) PaddlePaddle version: v1.8.5
   2) GPU: Tesla P4, CUDA 10, cuDNN 7.6, TensorRT 7.0.0.11
   3) OS: Ubuntu 16.04

  • Inference info

   1) C++ inference: self-built paddle-trt inference library, compiled with:

cmake .. \
      -DWITH_MKL=OFF \
      -DWITH_MKLDNN=OFF \
      -DCMAKE_BUILD_TYPE=Release \
      -DWITH_PYTHON=OFF   \
      -DWITH_XBYAK=ON \
      -DTENSORRT_ROOT=/opt/TensorRT-7.0.0.11 \
      -DON_INFER=ON \
      -DFLUID_INFERENCE_INSTALL_DIR=/ljay/workspace/proj/ljay-cuda10/Paddle/install
  • Problem description: with the compiled inference library and AnalysisConfig, loading the faster-rcnn model fails with the following dims error:
I1009 12:20:44.080718  3812 graph_pattern_detector.cc:101] ---  detected 55 subgraphs
--- Running IR pass [conv_eltwiseadd_affine_channel_fuse_pass]
--- Running IR pass [shuffle_channel_detect_pass]
--- Running IR pass [quant_conv2d_dequant_fuse_pass]
--- Running IR pass [delete_quant_dequant_op_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [skip_layernorm_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [fc_fuse_pass]
I1009 12:20:44.197888  3812 graph_pattern_detector.cc:101] ---  detected 2 subgraphs
I1009 12:20:44.199591  3812 graph_pattern_detector.cc:101] ---  detected 2 subgraphs
--- Running IR pass [tensorrt_subgraph_pass]
I1009 12:20:44.252492  3812 tensorrt_subgraph_pass.cc:115] ---  detect a sub-graph with 13 nodes
W1009 12:20:44.253955  3812 tensorrt_subgraph_pass.cc:285] The Paddle lib links the 7011 version TensorRT, make sure the runtime TensorRT you are using is no less than this version, otherwise, there might be Segfault!
I1009 12:20:44.254004  3812 tensorrt_subgraph_pass.cc:321] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
  what():

--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0   std::string paddle::platform::GetTraceBackString<std::string >(std::string&&, char const*, int)
1   paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2   paddle::inference::tensorrt::OpConverter::ConvertBlockToTRTEngine(paddle::framework::BlockDesc*, paddle::framework::Scope const&, std::vector<std::string, std::allocator<std::string > > const&, std::unordered_set<std::string, std::hash<std::string >, std::equal_to<std::string >, std::allocator<std::string > > const&, std::vector<std::string, std::allocator<std::string > > const&, paddle::inference::tensorrt::TensorRTEngine*)
3   paddle::inference::analysis::TensorRtSubgraphPass::CreateTensorRTOp(paddle::framework::ir::Node*, paddle::framework::ir::Graph*, std::vector<std::string, std::allocator<std::string > > const&, std::vector<std::string, std::allocator<std::string > >*) const
4   paddle::inference::analysis::TensorRtSubgraphPass::ApplyImpl(paddle::framework::ir::Graph*) const
5   paddle::framework::ir::Pass::Apply(paddle::framework::ir::Graph*) const
6   paddle::inference::analysis::IRPassManager::Apply(std::unique_ptr<paddle::framework::ir::Graph, std::default_delete<paddle::framework::ir::Graph> >)
7   paddle::inference::analysis::IrAnalysisPass::RunImpl(paddle::inference::analysis::Argument*)
8   paddle::inference::analysis::Analyzer::RunAnalysis(paddle::inference::analysis::Argument*)
9   paddle::AnalysisPredictor::OptimizeInferenceProgram()
10  paddle::AnalysisPredictor::PrepareProgram(std::shared_ptr<paddle::framework::ProgramDesc> const&)
11  paddle::AnalysisPredictor::Init(std::shared_ptr<paddle::framework::Scope> const&, std::shared_ptr<paddle::framework::ProgramDesc> const&)
12  std::unique_ptr<paddle::PaddlePredictor, std::default_delete<paddle::PaddlePredictor> > paddle::CreatePaddlePredictor<paddle::AnalysisConfig, (paddle::PaddleEngineKind)2>(paddle::AnalysisConfig const&)
13  std::unique_ptr<paddle::PaddlePredictor, std::default_delete<paddle::PaddlePredictor> > paddle::CreatePaddlePredictor<paddle::AnalysisConfig>(paddle::AnalysisConfig const&)

----------------------
Error Message Summary:
----------------------
InvalidArgumentError: TensorRT's tensor input requires at least 2 dimensions, but input fpn_topdown_res4_sum.tmp_0 has 1 dims.
  [Hint: Expected shape.size() > 1UL, but received shape.size():1 <= 1UL:1.] at (/ljay/workspace/proj/ljay-cuda10-paddletrt/Paddle/paddle/fluid/inference/tensorrt/engine.h:67)

Aborted (core dumped)

The config is set up as follows:

paddle::AnalysisConfig config;
// Dynamic-shape ranges for the "image" input: min / max / optimal.
std::map<std::string, std::vector<int>> min_input_shape = {{"image", {1, 3, 1, 1}}};
std::map<std::string, std::vector<int>> max_input_shape = {{"image", {1, 3, 1312, 1312}}};
std::map<std::string, std::vector<int>> opt_input_shape = {{"image", {1, 3, 960, 960}}};
// Load the model from in-memory buffers (program + params).
config.SetModelBuffer(
    prog_file_cont.data(), prog_file_cont.length(),
    params_file_cont.data(), params_file_cont.length());
// 1000 MB initial GPU memory pool on the given device.
config.EnableUseGpu(1000, _config.device);
config.EnableTensorRtEngine(20 << 20,  // workspace_size
                            1,         // max_batch_size
                            3,         // min_subgraph_size
                            paddle::AnalysisConfig::Precision::kFloat32,
                            false,     // use_static
                            false);    // use_calib_mode
config.SetTRTDynamicShapeInfo(min_input_shape, max_input_shape, opt_input_shape);
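
The call stack above shows the exception is thrown while the tensorrt_subgraph_pass builds the TRT engine, i.e. during predictor creation. For reference, the config is consumed roughly as in the minimal sketch below (reading the model buffers and running inference are omitted):

// Sketch: creating the predictor from the config above. With TRT enabled,
// Paddle 1.8.x builds the TensorRT sub-graph engines inside this call,
// which is where the EnforceNotMet from the log is raised.
auto predictor = paddle::CreatePaddlePredictor<paddle::AnalysisConfig>(config);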

Questions:

1. Is faster-rcnn already supported?
2. What causes the error above, and how should it be fixed?

Thanks~

ajsxfq5m  #1

Full log:
[Sorry, the log posted earlier was from a Retinanet model; after raising the minimum subgraph node count to 32 as suggested, that model runs successfully.]

For faster-rcnn, however, setting the minimum subgraph node count to 32 leads to a different error. Log below:

I1009 12:25:35.478006  3929 graph_pattern_detector.cc:101] ---  detected 55 subgraphs
--- Running IR pass [conv_eltwiseadd_affine_channel_fuse_pass]
--- Running IR pass [shuffle_channel_detect_pass]
--- Running IR pass [quant_conv2d_dequant_fuse_pass]
--- Running IR pass [delete_quant_dequant_op_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [skip_layernorm_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [fc_fuse_pass]
I1009 12:25:35.595186  3929 graph_pattern_detector.cc:101] ---  detected 2 subgraphs
I1009 12:25:35.596897  3929 graph_pattern_detector.cc:101] ---  detected 2 subgraphs
--- Running IR pass [tensorrt_subgraph_pass]
I1009 12:25:35.647811  3929 tensorrt_subgraph_pass.cc:115] ---  detect a sub-graph with 206 nodes
W1009 12:25:35.664347  3929 tensorrt_subgraph_pass.cc:285] The Paddle lib links the 7011 version TensorRT, make sure the runtime TensorRT you are using is no less than this version, otherwise, there might be Segfault!
I1009 12:25:35.664405  3929 tensorrt_subgraph_pass.cc:321] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
I1009 12:25:36.129441  3929 engine.cc:176] Run Paddle-TRT Dynamic Shape mode.
E1009 12:26:33.447953  3929 helper.h:76] elementwise (Output: res4a.add.output.5290): dimensions not compatible for elementwise
E1009 12:26:33.447997  3929 helper.h:76] shapeMachine.cpp (252) - Shape Error in operator(): broadcast iwth incompatible dimensions
E1009 12:26:33.448009  3929 helper.h:76] Instruction: CHECK_BROADCAST 84 83
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
I1009 12:26:33.505208  3929 graph_pattern_detector.cc:101] ---  detected 3 subgraphs
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
I1009 12:26:33.508797  3929 graph_pattern_detector.cc:101] ---  detected 10 subgraphs
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
I1009 12:26:33.520928  3929 ir_params_sync_among_devices_pass.cc:41] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [ir_graph_to_program_pass]
I1009 12:26:33.658653  3929 analysis_predictor.cc:496] ======= optimize end =======
W1009 12:26:33.846643  3929 device_context.cc:252] Please NOTE: device: 2, CUDA Capability: 61, Driver API Version: 10.2, Runtime API Version: 10.0
W1009 12:26:33.847007  3929 device_context.cc:260] device: 2, cuDNN Version: 7.6.
E1009 12:26:33.858134  3929 helper.h:76] elementwise (Output: res4a.add.output.5290): dimensions not compatible for elementwise
E1009 12:26:33.858175  3929 helper.h:76] shapeMachine.cpp (252) - Shape Error in operator(): broadcast iwth incompatible dimensions
E1009 12:26:33.858217  3929 helper.h:76] Instruction: CHECK_BROADCAST 52 51
[... the same three error lines repeat another 16 times ...]
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
  what():

--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0   std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int)
1   paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int)
2   paddle::operators::InterpolateOp::InferShape(paddle::framework::InferShapeContext*) const
3   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
4   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
5   paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
6   paddle::framework::NaiveExecutor::Run()
7   paddle::AnalysisPredictor::Run(std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> > const&, std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> >*, int)

----------------------
Error Message Summary:
----------------------
Error: Input(X) dimension must be 4 or 5 at (/ljay/workspace/proj/ljay-cuda10-paddletrt/Paddle/paddle/fluid/operators/interpolate_op.cc:194)
  [operator < nearest_interp > error]
Aborted (core dumped)
c3frrgcw  #2

Has this been solved? How do you actually set the 'minimum subgraph node count to 32'?

w9apscun  #3

void EnableTensorRtEngine(int workspace_size = 1 << 20,
                          int max_batch_size = 1,
                          int min_subgraph_size = 3,
                          Precision precision = Precision::kFloat32,
                          bool use_static = false,
                          bool use_calib_mode = true);

Pass min_subgraph_size = 32 (the third argument), e.g. as in the sketch below.
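
A minimal sketch of that call with the value applied, reusing the argument values already posted in the question (only min_subgraph_size changes):

// Sketch: sub-graphs with fewer than 32 nodes stay on native Paddle ops
// instead of being converted to a TensorRT engine.
config.EnableTensorRtEngine(20 << 20,  // workspace_size
                            1,         // max_batch_size
                            32,        // min_subgraph_size
                            paddle::AnalysisConfig::Precision::kFloat32,
                            false,     // use_static
                            false);    // use_calib_mode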

o7jaxewo  #4

Has this been solved? How do you actually set the 'minimum subgraph node count to 32'?

@shikeno I'm on Paddle 1.8.5. Reportedly you need 2.0-beta to get past the 'Instruction: CHECK_BROADCAST 52 51' problem; I haven't verified that yet.
