Python: "Could not load dynamic library 'libcudnn.so.8'" when running TensorFlow on Ubuntu 20.04

jmo0nnb3  posted on 2023-02-07  in Python

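Note: there are many similar questions, but they cover different Ubuntu versions and different specific libraries, and I have not been able to work out which combination of symlinks and environment variables (such as LD_LIBRARY_PATH) is needed.
Here is my NVIDIA configuration: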

$ nvidia-smi
Tue Apr  6 11:35:54 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 2070    Off  | 00000000:01:00.0 Off |                  N/A |
| 18%   25C    P8     9W / 175W |     25MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1081      G   /usr/lib/xorg/Xorg                 20MiB |
|    0   N/A  N/A      1465      G   /usr/bin/gnome-shell                3MiB |
+-----------------------------------------------------------------------------+

When I run a TensorFlow program, the following happens:

2021-04-06 14:35:01.589906: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2021-04-06 14:35:01.589914: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...

Has anyone seen this particular combination? How did you resolve it?
Here is one of the other fixes I tried, which changed nothing:

conda install cudatoolkit=11.0
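
Since the question turns on where the loader looks, here is a minimal sketch for listing any libcudnn files, and the targets of any symlinks, in the directories named by LD_LIBRARY_PATH. The helper names `find_library_files` and `default_search_dirs` are hypothetical. Including `$CONDA_PREFIX/lib` is an assumption for convenience: it is where an activated conda environment keeps its shared libraries, but it is not on LD_LIBRARY_PATH unless you add it.

```python
import os
from pathlib import Path


def find_library_files(pattern, search_dirs):
    """Return (path, resolved_target) pairs for files matching pattern."""
    hits = []
    for directory in search_dirs:
        d = Path(directory)
        if not d.is_dir():
            continue  # skip entries that are not existing directories
        for path in sorted(d.glob(pattern)):
            # resolve() follows symlinks, so a misdirected link shows up here
            hits.append((str(path), str(path.resolve())))
    return hits


def default_search_dirs():
    """Directories from LD_LIBRARY_PATH, plus $CONDA_PREFIX/lib when set."""
    dirs = [p for p in os.environ.get("LD_LIBRARY_PATH", "").split(":") if p]
    conda_prefix = os.environ.get("CONDA_PREFIX")
    if conda_prefix:
        # Assumption: listed for convenience only, see the note above.
        dirs.append(os.path.join(conda_prefix, "lib"))
    return dirs


if __name__ == "__main__":
    for path, target in find_library_files("libcudnn.so*", default_search_dirs()):
        print(path, "->", target)
```

If this prints nothing, no libcudnn file sits in any of those directories, which would be consistent with the "No such file or directory" warning.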

tkclm6bt1#

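So I ran into the same problem. As the comments say, it is because you need to install cuDNN. There is a guide for that here.
But since I already know your distribution (Ubuntu 20.04), I can give you the command lines directly: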

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/${last_public_key}.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
sudo apt-get update
sudo apt-get install libcudnn8
sudo apt-get install libcudnn8-dev

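where ${last_public_key} is the most recent public key (a file with the .pub extension) published at https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ (when this post was edited on May 9, 2022, it was 3bf863cc.pub, i.e. ${last_public_key} is 3bf863cc).
If you want to install a specific version, replace the last two commands with: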

sudo apt-get install libcudnn8=${cudnn_version}-1+${cuda_version}
sudo apt-get install libcudnn8-dev=${cudnn_version}-1+${cuda_version}

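where ${cudnn_version} is, for example, 8.2.4.* and ${cuda_version} is, for example, cuda11.0 (your nvidia-smi output shows 11.0; I have not tested that version because mine is 11.4, but I would guess it works the same way).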
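
To make the substitution concrete, here is a minimal sketch that builds the two version-pinned commands from the example values above. The helper names `apt_package_spec` and `install_commands` are hypothetical, and the code only formats strings; it does not check that the version exists in the repository.

```python
def apt_package_spec(package, cudnn_version, cuda_version):
    """Build the 'name=version' argument in the form used by the commands above."""
    return f"{package}={cudnn_version}-1+{cuda_version}"


def install_commands(cudnn_version, cuda_version):
    """Return the two apt-get commands for the runtime and dev packages."""
    return [
        "sudo apt-get install " + apt_package_spec(pkg, cudnn_version, cuda_version)
        for pkg in ("libcudnn8", "libcudnn8-dev")
    ]


if __name__ == "__main__":
    # Example values taken from the text above.
    for cmd in install_commands("8.2.4.*", "cuda11.0"):
        print(cmd)
```

Both packages are pinned to the same version string, so the runtime library and the dev package cannot drift apart.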


sczxawaw2#

What I used on Ubuntu 22.04:

sudo apt install nvidia-cudnn
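
To confirm that the install made the library visible, here is a minimal sketch that asks the dynamic loader to open libcudnn.so.8 without involving TensorFlow. The function name `can_load` is hypothetical. It assumes that a successful `ctypes.CDLL` call from the same Python environment is a reasonable proxy for what TensorFlow's own loader will find, which may not hold in every setup.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open library_name, else False."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the loader cannot find or open the shared object.
        return False
    return True


if __name__ == "__main__":
    name = "libcudnn.so.8"
    print(name, "loadable" if can_load(name) else "NOT loadable")
```

Running this before and after the install shows whether the package changed anything the loader can see.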

laik7k3q3#

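I ran into the same problem on CentOS 7-6. Since I do not have sudo privileges, I solved it by installing cuDNN from the anaconda channel.
In the environment where the latest TensorFlow is installed: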

conda install -c anaconda cudnn

You can check the installed packages with conda list (I had previously installed cudatoolkit from anaconda):

Name                      Version              Build  Channel
cudatoolkit               11.3.1               h2bc3f7f_2
cudnn                     8.2.1                cuda11.3_0
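
In the listing above, the cudnn build tag (cuda11.3_0) names the same CUDA release as the cudatoolkit version (11.3.1). Here is a minimal sketch of checking that automatically from `conda list` output. The function names are hypothetical, and the check assumes the build tag always follows the `cuda<major>.<minor>` pattern seen here, which may not be true for every channel.

```python
import re


def parse_conda_list(text):
    """Map package name -> (version, build) from 'conda list' style rows."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blanks, comment lines, and the header row.
        if len(parts) < 3 or parts[0].startswith("#") or parts[0] == "Name":
            continue
        packages[parts[0]] = (parts[1], parts[2])
    return packages


def cudnn_matches_toolkit(packages):
    """True if cudnn's build tag names cudatoolkit's major.minor version."""
    toolkit_version, _ = packages["cudatoolkit"]
    _, cudnn_build = packages["cudnn"]
    match = re.match(r"cuda(\d+\.\d+)", cudnn_build)
    if match is None:
        return False  # build tag does not follow the assumed pattern
    tagged = match.group(1)
    return toolkit_version == tagged or toolkit_version.startswith(tagged + ".")


if __name__ == "__main__":
    listing = """\
Name                      Version              Build  Channel
cudatoolkit               11.3.1               h2bc3f7f_2
cudnn                     8.2.1                cuda11.3_0
"""
    print(cudnn_matches_toolkit(parse_conda_list(listing)))
```

A mismatch does not prove the setup is broken, but it is a cheap first thing to rule out.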

You can check whether TensorFlow and the GPUs are talking to each other:

(tf2_6) [xxxx@co-dept-rd-gpu-01 envs]$ python
Python 3.8.12 (default, Oct 12 2021, 13:49:34)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> gpus = tf.config.experimental.list_physical_devices('GPU')
>>> for gpu in gpus:
...     print("Name:", gpu.name, "  Type:", gpu.device_type)
...
Name: /physical_device:GPU:0   Type: GPU
Name: /physical_device:GPU:1   Type: GPU
