Paddle FatalError: Segmentation fault

sd2nnvve  posted on 2021-11-30  in Java
eval model::   3% 10/300 [00:08<04:12,  1.15it/s]

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::imperative::Tracer::TraceOp(std::string const&, paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::AttributeMap, std::map<std::string, std::string, std::less<std::string >, std::allocator<std::pair<std::string const, std::string > > > const&)
1   paddle::imperative::Tracer::TraceOp(std::string const&, paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::AttributeMap, paddle::platform::Place const&, bool, std::map<std::string, std::string, std::less<std::string >, std::allocator<std::pair<std::string const, std::string > > > const&)
2   paddle::imperative::PreparedOp::Run(paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::AttributeMap const&, paddle::framework::AttributeMap const&)
3   std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::CUDNNConvOpKernel<float>, paddle::operators::CUDNNConvOpKernel<double>, paddle::operators::CUDNNConvOpKernel<paddle::platform::float16> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
4   paddle::operators::CUDNNConvOpKernel<float>::Compute(paddle::framework::ExecutionContext const&) const
5   paddle::framework::Tensor::mutable_data(paddle::platform::Place const&, paddle::framework::proto::VarType_Type, unsigned long)
6   paddle::memory::AllocShared(paddle::platform::Place const&, unsigned long)
7   paddle::memory::allocation::AllocatorFacade::AllocShared(paddle::platform::Place const&, unsigned long)
8   paddle::memory::allocation::AllocatorFacade::Alloc(paddle::platform::Place const&, unsigned long)
9   paddle::memory::allocation::RetryAllocator::AllocateImpl(unsigned long)
10  paddle::memory::allocation::AutoGrowthBestFitAllocator::FreeIdleChunks()
----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
  [TimeInfo:***Aborted at 1636257571 (unix time) try "date -d @1636257571" if you are using GNU date***]
  [SignalInfo:***SIGSEGV (@0x28) received by PID 960 (TID 0x7f26d386d780) from PID 40***]

I don't know where the problem is. I searched for solutions and tried many of them, but none solved it. Could you help me take a look?

pbwdgjma 1#

Thank you so much for your help! Nice to meet you!

6ojccjat 2#

@dang-nh194423

There is no more debug info, so I have no idea how to find the real cause.

gc0ot86w 3#

So, there is no solution for this error 😢

0ejtzxu1 4#

@dang-nh194423

Segmentation fault: maybe it is an illegal attempt to access an uninitialized tensor.

gpfsuwkq 5#

@GuoxiaWang
I use Google Colab, so I haven't pushed it to GitHub.
I want to train the pretrained model, but I found only one page on how to do it. If you need it, I will share it with you.

xiozqbni 6#

@dang-nh194423

Which repo is this?

tuwxkamq 7#

@dang-nh194423


# evaluation script
import os
os.environ['GLOG_v'] = "3"
os.environ['FLAGS_call_stack_level'] = "2"

# Please set these environment variables where you run the Paddle code.
# GLOG_v sets the VLOG level: "3" prints C++ VLOG(3) info.
# FLAGS_call_stack_level controls the C++ call stack: "2" prints the C++ call stack.
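
Since the evaluation in this thread is started as a separate `python3 tools/eval.py ...` command, one option is to pass the two variables through the child process environment, which avoids depending on when Paddle reads them. The following is only a minimal sketch under that assumption: the helper names `build_debug_env` and `run_with_debug_flags` and the log file name `eval_debug.log` are hypothetical and not part of Paddle or PaddleOCR.

```python
import os
import subprocess

# Debug settings quoted from the reply above.
DEBUG_FLAGS = {
    "GLOG_v": "3",                  # print C++ VLOG(3) info
    "FLAGS_call_stack_level": "2",  # print the C++ call stack
}


def build_debug_env(base_env=None):
    """Return a copy of the environment with the debug flags added."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(DEBUG_FLAGS)
    return env


def run_with_debug_flags(command, log_path):
    """Run `command` with the debug flags set; save stdout and stderr to `log_path`."""
    with open(log_path, "w") as log_file:
        result = subprocess.run(
            command,
            env=build_debug_env(),
            stdout=log_file,
            stderr=subprocess.STDOUT,
        )
    return result.returncode


# Hypothetical usage with the eval command posted in this thread:
# run_with_debug_flags(
#     ["python3", "tools/eval.py",
#      "-c", "configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml",
#      "-o", "Global.checkpoints=./output/ch_db_res18_2/latest"],
#     "eval_debug.log",
# )
```

Writing the combined output to a file also leaves a log of the evaluation run that can be pasted into the issue.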

goucqfw6 8#

@GuoxiaWang
I only run the one line below 😊

!python3 tools/eval.py -c configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml -o Global.checkpoints=./output/ch_db_res18_2/latest

q1qsirdb 9#

@dang-nh194423

Can you paste your code?

yptwkmov 10#

@GuoxiaWang
Yes, thank you. But what should I do now?

bvuwiixz 11#

@dang-nh194423

The traceback indicates that the error happened during GPU memory allocation in the Conv layer.
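
One way to see this in the traceback from the question: it is labelled "most recent call last", so the highest-numbered frame is the innermost one. The sketch below is only an illustration, assuming each frame line has the form "index, whitespace, symbol" as in the post. `parse_frames` and `short_name` are hypothetical helpers, and `SAMPLE` copies frames 4 to 10 from the question.

```python
import re

# Frames 4-10 copied from the traceback in the question.
SAMPLE = """\
4   paddle::operators::CUDNNConvOpKernel<float>::Compute(paddle::framework::ExecutionContext const&) const
5   paddle::framework::Tensor::mutable_data(paddle::platform::Place const&, paddle::framework::proto::VarType_Type, unsigned long)
6   paddle::memory::AllocShared(paddle::platform::Place const&, unsigned long)
7   paddle::memory::allocation::AllocatorFacade::AllocShared(paddle::platform::Place const&, unsigned long)
8   paddle::memory::allocation::AllocatorFacade::Alloc(paddle::platform::Place const&, unsigned long)
9   paddle::memory::allocation::RetryAllocator::AllocateImpl(unsigned long)
10  paddle::memory::allocation::AutoGrowthBestFitAllocator::FreeIdleChunks()
"""

# A frame line is a decimal index, whitespace, then the C++ symbol.
FRAME_PATTERN = re.compile(r"^(\d+)\s+(\S.*)$")


def parse_frames(traceback_text):
    """Return (index, symbol) pairs for every numbered frame line."""
    frames = []
    for line in traceback_text.splitlines():
        match = FRAME_PATTERN.match(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    return frames


def short_name(symbol):
    """Drop the argument list so only the qualified function name remains."""
    return symbol.split("(", 1)[0]


if __name__ == "__main__":
    for index, symbol in parse_frames(SAMPLE):
        print(index, short_name(symbol))
```

Read this way, frame 4 is the cuDNN convolution kernel, frames 5 to 9 are the tensor allocation path, and frame 10 is inside the auto-growth allocator, which matches the reading given in this reply.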

34gzjxbg 12#

@GuoxiaWang
I don't think that is the problem, because when I evaluate another model (detection with MobileNet), no error happens. The error only happens when I use ResNet18.

beq87vna 13#

@dang-nh194423

What is the error?

i1icjdpr 14#

Hi! We have received your issue and will arrange for technicians to answer it as soon as possible, so please be patient. Please check again that you have provided a clear problem description, code to reproduce it, your environment and versions, and the error message. You can also look for an answer in the official API documentation, the FAQ, past GitHub issues, and the AI community. Have a nice day!

lqfhib0f 15#

I can't find a log file for the evaluation. The folder contains only train.log. I read train.log, and I think the eval log would be the same, because I got this error when running evaluation of the model.
