TensorFlow: Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory

jchrr9hc, posted on 2023-03-13 in Other

When I try to run a Python script that uses TensorFlow, it shows the following error:

2020-10-04 16:01:44.994797: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-10-04 16:01:46.780656: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-10-04 16:01:46.795642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:03:00.0 name: TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2020-10-04 16:01:46.795699: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-10-04 16:01:46.795808: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/extras/CUPTI/lib64/:/usr/local/cuda-10.0/lib64
2020-10-04 16:01:46.797391: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-10-04 16:01:46.797707: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-10-04 16:01:46.799529: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-10-04 16:01:46.800524: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-10-04 16:01:46.804150: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-10-04 16:01:46.804169: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...

Output of nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    On   | 00000000:03:00.0 Off |                  N/A |
| 23%   28C    P8     9W / 250W |     18MiB / 12194MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1825      G   /usr/lib/xorg/Xorg                  9MiB |
|    0   N/A  N/A      1957      G   /usr/bin/gnome-shell                6MiB |
+-----------------------------------------------------------------------------+

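TensorFlow version 2.3.1, Ubuntu 18.04.
I tried removing the CUDA toolkit completely and installing it from scratch, but the error remains. Can anyone help me identify the source of the problem?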

w80xi6nr1#

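You need to download or update CUDA.
If you are looking for the CUDA Toolkit 10.2 download, use this link: https://developer.nvidia.com/cuda-10.2-download-archive
Then activate your virtual environment and set LD_LIBRARY_PATH, as described for example in the question "Could not load dynamic library 'libcudart.so.10.0'" (on Ubuntu 18.04).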
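
Before changing LD_LIBRARY_PATH, it can help to see which of its entries actually contain the library TensorFlow asks for. The following is a minimal Python sketch, assuming the library name from the warning in the question (libcublas.so.10). The helper name locate_library is made up for this illustration, and the sketch inspects only LD_LIBRARY_PATH, not the ldconfig cache or the default system directories that the dynamic loader also searches.

```python
import os


def locate_library(lib_name, search_path=None):
    """Return the LD_LIBRARY_PATH entries that contain lib_name.

    locate_library is a hypothetical helper, not part of TensorFlow or CUDA.
    It only inspects LD_LIBRARY_PATH; the ldconfig cache is not consulted.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(":"):
        # Empty entries (caused by stray colons) are skipped.
        if directory and os.path.exists(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    found = locate_library("libcublas.so.10")
    if found:
        print("found in:", found)
    else:
        print("libcublas.so.10 is not in any LD_LIBRARY_PATH directory")
```

An empty result for a library that exists elsewhere on disk suggests that the directory holding it still needs to be added to LD_LIBRARY_PATH.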

fv2wmkja2#

**If you are on Ubuntu 18.04, run these commands.** Or follow the instructions here.

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin

sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600

sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub

sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"

sudo apt-get update

sudo apt-get -y install cuda
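
After the installation finishes, one way to check the result without starting TensorFlow is to ask the dynamic loader to open each library directly. This is a minimal Python sketch, assuming the library names printed by dso_loader in the question's log. The helper can_load is hypothetical; it relies on ctypes.CDLL, which raises OSError when a shared object cannot be opened.

```python
import ctypes

# Library names copied from the dso_loader lines in the question's log.
REQUIRED_LIBS = [
    "libcudart.so.10.1",
    "libcublas.so.10",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.10",
    "libcusparse.so.10",
    "libcudnn.so.7",
]


def can_load(lib_name):
    """Return True if the dynamic loader can open lib_name, else False."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    for name in REQUIRED_LIBS:
        status = "ok" if can_load(name) else "MISSING"
        print(f"{name}: {status}")
```

Any name reported as MISSING here would probably also trigger the "Could not load dynamic library" warning in TensorFlow.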

tjjdgumg3#

This worked for me:
sudo apt-get install <library>10.1

m4pnthwp4#

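Make sure you install the TensorFlow build that matches your hardware (GPU or CPU). I installed TensorFlow in a Pipenv virtual environment on a machine with an Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz 2.59 GHz. I got the same message when I used pip install tensorflow. Below is the output after running a notebook cell, or any code, containing import tensorflow as tf.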

2023-02-27 14:53:16.590721: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-27 14:53:16.957000: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-02-27 14:53:16.957019: I tensorflow/compiler/xla/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-02-27 14:53:18.213523: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2023-02-27 14:53:18.213591: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2023-02-27 14:53:18.213597: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

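When I changed the install command to pip install tensorflow-cpu, the error went away. Here is the new output after running the same notebook cell, or any code, containing import tensorflow as tf.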

2023-02-28 10:01:35.003241: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

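pip install tensorflow-gpu could also solve the problem. See here for details about tensorflow-gpu. However, judging by the official pip repository, it seems that since December 2022 the recommendation is to install tensorflow instead of tensorflow-gpu.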
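
Since several differently named packages can provide TensorFlow, it may be worth checking which of them are installed in the active environment before reinstalling anything. This is a minimal Python sketch, assuming Python 3.8 or newer (for importlib.metadata) and the three distribution names mentioned in this answer. The helper installed_versions is made up for this illustration.

```python
from importlib import metadata

# Distribution names mentioned in this answer.
CANDIDATES = ("tensorflow", "tensorflow-cpu", "tensorflow-gpu")


def installed_versions(names=CANDIDATES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the same virtual environment as the notebook or script, because each environment has its own set of installed packages.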

z31licg05#

On Ubuntu 20.04 you can simply install NVIDIA's CUDA toolkit:

sudo apt-get update
sudo apt install nvidia-cuda-toolkit

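There is also installation advice for Windows.
The package is about 1 GB and takes a few minutes to install. Afterwards you need to export the path variables so that the library can be found:
1. Find the shared object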

sudo find / -name 'libcudart.so*'

/usr/lib/x86_64-linux-gnu/libcudart.so.10.1
/usr/lib/x86_64-linux-gnu/libcudart.so

2. Add the folder to the path so that Python can find it

export PATH=/usr/lib/x86_64-linux-gnu${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

3. Permissions

sudo chmod a+r /usr/lib/x86_64-linux-gnu/libcuda*

Helped me
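
The ${VAR:+:${VAR}} expansion used in step 2 appends a colon and the old value only when the variable is already set and non-empty. That avoids leaving an empty entry in the path, which the loader would treat as the current directory. The same rule can be sketched in Python as follows; prepend_path is a hypothetical helper name and the directory is the one reported by find in step 1.

```python
import os


def prepend_path(directory, current):
    """Mimic DIR${VAR:+:${VAR}}: add a colon only when current is non-empty."""
    return f"{directory}:{current}" if current else directory


if __name__ == "__main__":
    lib_dir = "/usr/lib/x86_64-linux-gnu"  # directory reported by find in step 1
    new_value = prepend_path(lib_dir, os.environ.get("LD_LIBRARY_PATH", ""))
    # Print the value that the export line in step 2 would produce.
    print(new_value)
```

With an unset or empty LD_LIBRARY_PATH this yields just the directory; otherwise the directory is placed in front of the existing value.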

fivyi3re6#

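This usually happens when TensorFlow is run with an incompatible CUDA version. It looks like this has been asked before (I cannot comment). Please refer to this question.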
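
One symptom of such a mismatch is that the library is present, but under a different major version than the one TensorFlow requests. This is a minimal Python sketch of that comparison, assuming conventional file names of the form libNAME.so.MAJOR[.MINOR...]. The helper names and the example file list are hypothetical and are not taken from the asker's machine.

```python
import re

# Matches names such as libcublas.so.10 or libcudart.so.10.1
SONAME = re.compile(r"^(lib[\w+-]+?)\.so\.(\d+)(?:\.\d+)*$")


def split_soname(file_name):
    """Return (base_name, major_version), or None if the name does not match."""
    match = SONAME.match(file_name)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def other_majors(required, available):
    """List major versions of the required library found under another major."""
    parsed_required = split_soname(required)
    if parsed_required is None:
        return []
    base, major = parsed_required
    majors = set()
    for name in available:
        parsed = split_soname(name)
        if parsed and parsed[0] == base and parsed[1] != major:
            majors.add(parsed[1])
    return sorted(majors)


if __name__ == "__main__":
    # Illustrative file names only.
    on_disk = ["libcublas.so.11", "libcublas.so.11.2.0", "libcudart.so.10.1"]
    print(other_majors("libcublas.so.10", on_disk))
```

A non-empty result hints that a CUDA toolkit is installed but does not match the version the TensorFlow build was compiled against.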

lztngnrs7#

I ran into this problem today. I went to the CUDA toolkit website, selected my options, and it showed instructions like the following:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.6.2/local_installers/cuda-repo-ubuntu2004-11-6-local_11.6.2-510.47.03-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2004-11-6-local_11.6.2-510.47.03-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu2004-11-6-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda         # I have broken packages, so could not invoke this command

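So the instructions will vary with your setup; do not copy them from here or from other Stack Overflow answers.

I could not run the last command, but after some trial and error I ran: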

sudo apt install libcudart.so.11.0   # this worked for me!

This worked for me!
