Paddle on Ubuntu: the GPU version installs without errors, but paddle.fluid.install_check.run_check() reports an error

rkkpypqq  posted on 2022-10-20  in  Other
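  • Version and environment information:

   1) PaddlePaddle version: paddlepaddle-gpu 1.4.1.post97 (only this GPU build is installed)
   3) GPU: nvidia 418
CUDA 9.0
cuDNN 7.3 (when 7.1 was installed, the code reported "The installed Paddle is compiled with CUDNN 7.3, but CUDNN version in your machine ", so 7.3 is what is installed now)
   4) System environment: Ubuntu 18.04, Python 3.6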

  • Installation method:

1) Installed with pip, inside a Python 3.6 virtual environment

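  • Reproduction information: for an error, please give the environment and the steps to reproduce it

On a single-GPU laptop with 2G of memory, the configuration above produces the problem below. While the program is running, GPU memory usage is 305MiB / 2004MiB.
On a desktop with 4 cores and 12G, the same configuration produces the problem below as well.

  • Problem description: please describe your problem in detail and paste the error messages, logs, and key code snippets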

W0618 14:37:08.514142 20636 device_context.cc:261] Please NOTE: device: 0, CUDA Capability: 50, Driver API Version: 10.1, Runtime API Version: 9.0
W0618 14:37:08.517132 20636 device_context.cc:269] device: 0, cuDNN Version: 7.0.
Traceback (most recent call last):
File "/home/zz/program/MNIST-paddle/line-paddle.py", line 58, in <module>
exe.run(startup_program)
File "/home/zz/env_python3/lib/python3.6/site-packages/paddle/fluid/executor.py", line 565, in run
use_program_cache=use_program_cache)
File "/home/zz/env_python3/lib/python3.6/site-packages/paddle/fluid/executor.py", line 642, in run
exe.run(program.desc, scope, 0, True, True, fetch_var_name)
paddle.fluid.core.EnforceNotMet: Invoke operator fill_constant error.
Python Callstacks:
File "/home/zz/env_python3/lib/python3.6/site-packages/paddle/fluid/framework.py", line 1725, in prepend_op
attrs=kwargs.get("attrs", None))
File "/home/zz/env_python3/lib/python3.6/site-packages/paddle/fluid/initializer.py", line 167, in __call__
stop_gradient=True)
File "/home/zz/env_python3/lib/python3.6/site-packages/paddle/fluid/framework.py", line 1517, in create_var
kwargs['initializer'](var, self)
File "/home/zz/env_python3/lib/python3.6/site-packages/paddle/fluid/layer_helper_base.py", line 382, in set_variable_initializer
initializer=initializer)
File "/home/zz/env_python3/lib/python3.6/site-packages/paddle/fluid/layers/tensor.py", line 152, in create_global_var
value=float(value), force_cpu=force_cpu))
File "/home/zz/env_python3/lib/python3.6/site-packages/paddle/fluid/optimizer.py", line 136, in create_global_learning_rate
persistable=True)
File "/home/zz/env_python3/lib/python3.6/site-packages/paddle/fluid/optimizer.py", line 275, in create_optimization_pass
self.create_global_learning_rate()
File "/home/zz/env_python3/lib/python3.6/site-packages/paddle/fluid/optimizer.py", line 441, in apply_gradients
optimize_ops = self.create_optimization_pass(params_grads)
File "/home/zz/env_python3/lib/python3.6/site-packages/paddle/fluid/optimizer.py", line 469, in apply_optimize
optimize_ops = self.apply_gradients(params_grads)
File "/home/zz/env_python3/lib/python3.6/site-packages/paddle/fluid/optimizer.py", line 500, in minimize
loss, startup_program=startup_program, params_grads=params_grads)
File "/home/zz/program/MNIST-paddle/line-paddle.py", line 29, in <module>
sgd_optimizer.minimize(avg_loss)
C++ Callstacks:
Enforce failed. Expected allocating <= available, but received allocating:1837034932 > available:1373896448.
Insufficient GPU memory to allocation. at [/paddle/paddle/fluid/platform/gpu_info.cc:262]
PaddlePaddle Call Stacks:
0 0x7fd967365228p void paddle::platform::EnforceNotMet::Init<std::string>(std::string, char const*, int) + 360
1 0x7fd967365577p paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int) + 87
2 0x7fd9690ad706p paddle::platform::GpuMaxChunkSize() + 630
3 0x7fd9690820a2p
4 0x7fd99e6f4827p
5 0x7fd96908174dp paddle::memory::legacy::GetGPUBuddyAllocator(int) + 109
6 0x7fd969082573p void* paddle::memory::legacy::Alloc<paddle::platform::CUDAPlace>(paddle::platform::CUDAPlace const&, unsigned long) + 35
7 0x7fd9690829b5p paddle::memory::allocation::LegacyAllocator::AllocateImpl(unsigned long, paddle::memory::allocation::Allocator::Attr) + 389
8 0x7fd9690a7c6bp paddle::memory::allocation::Allocator::Allocate(unsigned long, paddle::memory::allocation::Allocator::Attr) + 27
9 0x7fd9690765b3p paddle::memory::allocation::AllocatorFacade::Alloc(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long, paddle::memory::allocation::Allocator::Attr) + 435
10 0x7fd9690766d1p paddle::memory::allocation::AllocatorFacade::AllocShared(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long, paddle::memory::allocation::Allocator::Attr) + 33
11 0x7fd968cb45a0p paddle::memory::AllocShared(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long, paddle::memory::allocation::Allocator::Attr) + 48
12 0x7fd96904896ap paddle::framework::Tensor::mutable_data(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, paddle::framework::proto::VarType_Type, paddle::memory::allocation::Allocator::Attr, unsigned long) + 154
13 0x7fd9680210d1p paddle::operators::FillConstantKernel::Compute(paddle::framework::ExecutionContext const&) const + 497
14 0x7fd968024273p std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::FillConstantKernel, paddle::operators::FillConstantKernel, paddle::operators::FillConstantKernel, paddle::operators::FillConstantKernel, paddle::operators::FillConstantKernel<paddle::platform::float16> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) + 35
15 0x7fd968ff4376p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const + 662
16 0x7fd968ff4ae4p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 292
17 0x7fd968ff240cp paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 332
18 0x7fd9674d73fep paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 382
19 0x7fd9674d823fp paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool) + 143
20 0x7fd9673548dep
21 0x7fd9673977cep
22 0x565d5cp _PyCFunction_FastCallDict + 860
23 0x503073p
24 0x506859p _PyEval_EvalFrameDefault + 1097
25 0x504c28p
26 0x502540p
27 0x502f3dp
28 0x507641p _PyEval_EvalFrameDefault + 4657
29 0x504c28p
30 0x502540p
31 0x502f3dp
32 0x506859p _PyEval_EvalFrameDefault + 1097
33 0x504c28p
34 0x506393p PyEval_EvalCode + 35
35 0x634d52p
36 0x634e0ap PyRun_FileExFlags + 154
37 0x6385c8p PyRun_SimpleFileExFlags + 392
38 0x63915ap Py_Main + 1402
39 0x4a6f10p main + 224
40 0x7fd99e925b97p __libc_start_main + 231
41 0x5afa0ap _start + 42
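
The two byte counts in the "Enforce failed" line can be read off directly. The sketch below parses them and prints the request, the free amount, and the shortfall in MiB. It also estimates how far FLAGS_fraction_of_gpu_memory_to_use would have to drop for the same check to pass. That estimate rests on two assumptions that are not stated in the log: that the flag was at its Paddle 1.x default of 0.92, and that the requested size scales linearly with the flag. The helper names are made up for this illustration.

```python
import re

MIB = 1024 * 1024

# Assumption: the Paddle 1.x default for FLAGS_fraction_of_gpu_memory_to_use.
ASSUMED_DEFAULT_FRACTION = 0.92


def parse_allocation_error(message):
    """Return (allocating, available) in bytes from the enforce message."""
    match = re.search(r"allocating:(\d+) > available:(\d+)", message)
    if match is None:
        raise ValueError("no allocating/available pair found in message")
    return int(match.group(1)), int(match.group(2))


def largest_fitting_fraction(allocating, available,
                             current_fraction=ASSUMED_DEFAULT_FRACTION):
    """Scale the fraction so the request would equal `available`.

    Assumes the requested size is proportional to the fraction.
    """
    return current_fraction * available / allocating


# The line copied from the log above.
message = ("Enforce failed. Expected allocating <= available, but received "
           "allocating:1837034932 > available:1373896448.")

allocating, available = parse_allocation_error(message)
print("requested: %.1f MiB" % (allocating / MIB))
print("available: %.1f MiB" % (available / MIB))
print("shortfall: %.1f MiB" % ((allocating - available) / MIB))
print("fraction that would just fit: %.3f"
      % largest_fitting_fraction(allocating, available))
```

The request is close to the whole 2004MiB card and far larger than a small MNIST script needs, which suggests the process tried to reserve a fixed share of total GPU memory up front. Under the assumptions above, a fraction somewhat below the printed value should let the reservation succeed.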

fcy6dtqo  #2

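Before I confirmed that this was a GPU memory problem:
I was using Ubuntu 18.04, Python 3.5, and CUDA 9 + cuDNN 7 on a laptop with a single GPU and 2G of memory.
No matter what I tried, it would not run!
Then I used a desktop with the same configuration, 4 cores and 12G.
Whenever the GPU was already in use, the PaddlePaddle GPU code failed to run there as well.
I was running PaddlePaddle as a single-GPU process on GPU 0 while another running process on the machine was using GPU 3, and I got the error above!
Only after I terminated all of those processes could the PaddlePaddle GPU code run.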
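
A minimal sketch of checking this before launching, instead of killing processes by trial and error. It reads per-GPU memory usage from nvidia-smi and sets two environment variables before Paddle is imported. CUDA_VISIBLE_DEVICES and FLAGS_fraction_of_gpu_memory_to_use are existing settings, but the function names and the fraction 0.5 are chosen only for illustration, and the right fraction depends on how much memory is actually free on your machine.

```python
import os
import subprocess

# nvidia-smi query options; prints one "index, used, total" line per GPU in MiB.
NVIDIA_SMI_QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Turn lines such as '0, 305, 2004' into {gpu_index: (used_mib, total_mib)}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        index, used, total = [field.strip() for field in line.split(",")]
        usage[int(index)] = (int(used), int(total))
    return usage


def query_gpu_memory():
    """Run nvidia-smi and parse its output (needs an NVIDIA driver installed)."""
    result = subprocess.run(
        NVIDIA_SMI_QUERY,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # keeps this compatible with Python 3.6
        check=True,
    )
    return parse_gpu_memory(result.stdout)


def configure_paddle_env(gpu_index, fraction):
    """Set the environment variables; call this before `import paddle`."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    os.environ["FLAGS_fraction_of_gpu_memory_to_use"] = str(fraction)


# Offline demonstration using the usage figure quoted in the question.
sample = parse_gpu_memory("0, 305, 2004\n")
used_mib, total_mib = sample[0]
print("GPU 0: %d MiB used of %d MiB" % (used_mib, total_mib))

# Illustrative value only; pick it from the free memory you actually see.
configure_paddle_env(gpu_index=0, fraction=0.5)
```

On a real machine, call query_gpu_memory() first to see which GPUs are already occupied, then call configure_paddle_env() and import paddle only afterwards, since the flag is read when the library initializes.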
