Issue Description
I built and installed Paddle on an A100 machine with the following commands:
WITH_GPU=ON
WITH_DISTRIBUTE=ON
WITH_MKL=ON
DWITH_GLOO=ON
cmake .. -DCMAKE_INSTALL_PREFIX=./output/ \
    -DCMAKE_BUILD_TYPE=Release \
    -DWITH_PYTHON=ON \
    -DWITH_MKL=$WITH_MKL \
    -DWITH_GPU=$WITH_GPU \
    -DCUDA_ARCH_NAME=Auto \
    -DON_INFER=ON \
    -DWITH_TESTING=ON \
    -DWITH_DISTRIBUTE=$WITH_DISTRIBUTE \
    -DPY_VERSION=3.7 \
    -DWITH_GLOO=$DWITH_GLOO \
    -DWITH_TENSORRT=ON \
-DTENSORRT_ROOT=/usr/local/TensorRT-8.6.1.6/
Running it fails with the following error:
File "/root/miniconda3/lib/python3.7/site-packages/paddle/nn/layer/conv.py", line 703, in __init__
data_format=data_format,
File "/root/miniconda3/lib/python3.7/site-packages/paddle/nn/layer/conv.py", line 159, in __init__
default_initializer=_get_default_param_initializer(),
File "/root/miniconda3/lib/python3.7/site-packages/paddle/nn/layer/layers.py", line 715, in create_parameter
temp_attr, shape, dtype, is_bias, default_initializer
File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/layer_helper_base.py", line 431, in create_parameter
**attr._to_kwargs(with_initializer=True)
File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/framework.py", line 3949, in create_parameter
initializer(param, self)
File "/root/miniconda3/lib/python3.7/site-packages/paddle/nn/initializer/initializer.py", line 40, in __call__
return self.forward(param, block)
File "/root/miniconda3/lib/python3.7/site-packages/paddle/nn/initializer/normal.py", line 77, in forward
place,
OSError: (External) CUDA error(222), the provided PTX was compiled with an unsupported toolchain..
[Hint: 'cudaErrorUnsupportedPtxVersion'. This indicates that the provided PTX was compiled with an unsupported toolchain. The most common reason for this, is the PTXwas generated by a compiler newer than what is supported by the CUDA driver and PTX JIT compiler.] (at /root/paddlejob/workspace/env_run/zhangyaxian/Paddle/paddle/phi/backends/gpu/cuda/cuda_info.cc:209)
Any help would be appreciated, thanks.
Version & Environment Information
Paddle version: develop
cuda: 11.4
5 Answers
628mspwn1#
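Hello. According to the answer at https://forums.developer.nvidia.com/t/provided-ptx-was-compiled-with-an-unsupported-toolchain-error-using-cub/168292, this error is caused by a mismatch between the driver version and the nvcc version in your environment. As suggested there, you need to upgrade the driver on your machine.
For the compatibility matrix between CUDA versions and driver versions, see https://docs.nvidia.com/deploy/cuda-compatibility/index.html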
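As a rough illustration of the check described above, the sketch below reads the "CUDA Version" field from the nvidia-smi header (the highest CUDA version the installed driver supports) and compares it with the CUDA version the binary was built against. The function names are hypothetical helpers, not part of any NVIDIA or Paddle API, and the header string is the one posted later in this thread.

```python
import re


def parse_driver_cuda_version(nvidia_smi_output):
    """Return (major, minor) from the 'CUDA Version' field of nvidia-smi, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_supports(driver_cuda, toolkit_cuda):
    """Conservative rule: the build toolkit must not be newer than the driver's CUDA version."""
    return toolkit_cuda <= driver_cuda


# Header line copied from the nvidia-smi output quoted in this thread.
header = "| NVIDIA-SMI 470.57.02 Driver Version: 470.57.02 CUDA Version: 11.4 |"
driver_cuda = parse_driver_cuda_version(header)

print("driver supports CUDA up to:", driver_cuda)
print("CUDA 11.7 build OK:", driver_supports(driver_cuda, (11, 7)))
print("CUDA 11.2 build OK:", driver_supports(driver_cuda, (11, 2)))
```

The comparison is deliberately strict. Some newer builds may still run on an older driver, but code that needs PTX JIT compilation is the case that triggers error 222.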
pgpifvop2#
My CUDA version and driver version match.
rm5edbpk3#
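@XYZ-916 According to this error message:
This indicates that the provided PTX was compiled with an unsupported toolchain. The most common reason for this, is the PTXwas generated by a compiler newer than what is supported by the CUDA driver and PTX JIT compiler.
the nvcc in your build environment is probably newer than what your driver supports. This can happen when the nvcc version in your environment does not match CUDA 11.4. Could you check the nvcc version in your build environment?
If the nvcc version turns out to be fine, I suggest building or running inside NVIDIA's official image nvidia/cuda:11.4.3-cudnn8-devel-ubuntu18.04 to rule out environment problems.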
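One way to check the nvcc version in the build environment is sketched below. It assumes that `nvcc --version` prints a line of the form "Cuda compilation tools, release X.Y, ..."; the sample text in the code is illustrative, not output captured from the poster's machine, and `parse_nvcc_release` is a hypothetical helper name.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return (major, minor) from the 'release X.Y' field of `nvcc --version`, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Illustrative sample only; the real text comes from running `nvcc --version`.
sample = "Cuda compilation tools, release 11.4, V11.4.152"
print("sample nvcc release:", parse_nvcc_release(sample))

# Query the nvcc found on PATH, if any, since that is the one a build would pick up.
nvcc_path = shutil.which("nvcc")
if nvcc_path is not None:
    result = subprocess.run([nvcc_path, "--version"], capture_output=True, text=True)
    print("nvcc on PATH:", nvcc_path, parse_nvcc_release(result.stdout))
else:
    print("nvcc not found on PATH")
```

If the release printed here is newer than the CUDA version reported by nvidia-smi, the build is using a toolkit that the driver does not support.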
ycl3bljg4#
Has this issue been resolved? I get this error when running inference with PaddleOCR. Here is my environment information:
wecizke35#
Has this issue been resolved? I get this error when running inference with PaddleOCR. Here is my environment information:
I solved my problem!
nvidia-smi shows the following:
Fri Jul 12 19:24:40 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02 Driver Version: 470.57.02 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
The driver reports CUDA Version 11.4, which means you can install the paddlepaddle post112 (CUDA 11.2) build. I had installed post117 and ran into this problem.
https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html
wget https://paddle-wheel.bj.bcebos.com/develop/linux/linux-gpu-cuda11.2-cudnn8-mkl-gcc8.2-avx/paddlepaddle_gpu-0.0.0.post112-cp39-cp39-linux_x86_64.whl
python -m pip install paddlepaddle-gpu==0.0.0.post112 -f paddlepaddle_gpu-0.0.0.post112-cp39-cp39-linux_x86_64.whl
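The choice of wheel above can be sketched as a small helper. It assumes the `postNNN` suffix encodes the CUDA version with a single-digit minor (post112 as CUDA 11.2, as in the wheel URL above; reading post117 as CUDA 11.7 follows from the same assumption). Both function names are hypothetical and not part of pip or Paddle.

```python
def cuda_version_of_post_tag(tag):
    """Map a tag such as 'post112' to (11, 2); assumes a single-digit minor version."""
    digits = tag[len("post"):]
    return int(digits[:-1]), int(digits[-1])


def pick_post_tag(driver_cuda, available_tags):
    """Return the newest tag whose CUDA version does not exceed the driver's, or None."""
    usable = [t for t in available_tags if cuda_version_of_post_tag(t) <= driver_cuda]
    if not usable:
        return None
    return max(usable, key=cuda_version_of_post_tag)


# Driver CUDA version taken from the nvidia-smi header above (11.4).
print(pick_post_tag((11, 4), ["post112", "post117"]))
```

With a driver that reports CUDA 11.4, this selects post112 and rejects post117, matching the fix described in this answer.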
Now run check.py:
import paddle
paddle.utils.run_check()
The output is as follows, and the problem is fixed:
Running verify PaddlePaddle program ...
I0712 19:22:10.413522 124943 program_interpreter.cc:243] New Executor is Running.
W0712 19:22:10.414631 124943 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.0, Driver API Version: 11.4, Runtime API Version: 11.7
W0712 19:22:10.415433 124943 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
I0712 19:22:23.808157 124943 interpreter_util.cc:646] Standalone Executor is Used.
PaddlePaddle works well on 1 GPU.