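I have been trying to run TensorFlow on my GPU for several days, without success.
I know there are several questions about similar problems, but I have tried everything I found and none of it worked, which is why I am writing this question: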
How to install libcusolver.so.11
https://stackoverflow.com/a/67642774/15098668
I have installed driver 460.106.00 and CUDA 11.2 for an Nvidia GeForce RTX 3090:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.106.00   Driver Version: 460.106.00   CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090    On   | 00000000:08:00.0  On |                  N/A |
| 33%   26C    P8    22W / 350W |    282MiB / 24260MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1264      G   /usr/lib/xorg/Xorg                 59MiB |
|    0   N/A  N/A      3349      G   /usr/lib/xorg/Xorg                124MiB |
|    0   N/A  N/A      3508      G   /usr/bin/gnome-shell               77MiB |
|    0   N/A  N/A      6384      G   /usr/lib/firefox/firefox            4MiB |
+-----------------------------------------------------------------------------+
cuDNN:
cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 1
GCC compiler:
gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
I also added LD_LIBRARY_PATH to ~/.bashrc:
# Nvidia cuda toolkit
export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64${LD_LIBRARY_PATH+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda
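Exports in ~/.bashrc only reach processes started from a shell that has read the file, so it can help to confirm what the Python process itself sees. The following is a minimal sketch, not part of the original question: `find_library` is a hypothetical helper that looks for files matching a pattern in every directory listed in a colon-separated search path.

```python
import glob
import os


def find_library(pattern, search_path):
    """Return files matching `pattern` in each directory of a colon-separated path."""
    hits = []
    for directory in search_path.split(":"):
        if not directory:
            continue  # skip empty entries such as a leading or trailing colon
        hits.extend(sorted(glob.glob(os.path.join(directory, pattern))))
    return hits


if __name__ == "__main__":
    # Read the variable as seen by this process, not by the login shell.
    path = os.environ.get("LD_LIBRARY_PATH", "")
    print("LD_LIBRARY_PATH =", path or "(unset or empty)")
    matches = find_library("libcudart.so*", path)
    if not matches:
        print("no libcudart found in the listed directories")
    for match in matches:
        print("found", match)
```

If the variable prints as empty here while the shell shows it set, the interpreter was probably launched from an environment (an IDE, a service, a desktop launcher) that never sourced ~/.bashrc.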
I tried several tensorflow and tensorflow-gpu versions, from 2.4 to 2.7, but every one of them failed with:
2022-01-24 21:28:43.206834: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
or
2022-01-24 21:28:44.087779: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087827: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087858: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087891: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087921: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087947: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087975: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
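These warnings come from the dynamic loader, so the same failure can be reproduced without TensorFlow. The sketch below is an illustration under that assumption: it asks `ctypes.CDLL` to open each library named in the log and records the loader's error text. `check_libraries` and `CUDA_LIBS` are names introduced here, not anything from TensorFlow.

```python
import ctypes

# Library names copied from the warnings in the log.
CUDA_LIBS = [
    "libcudart.so.11.0",
    "libcublas.so.11",
    "libcublasLt.so.11",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.11",
    "libcusparse.so.11",
]


def check_libraries(names):
    """Map each library name to None if it loads, else to the loader's error text."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)  # uses the same search rules as any dlopen() call
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    for name, error in check_libraries(CUDA_LIBS).items():
        print(f"{name}: {'ok' if error is None else error}")
```

If every entry fails here as well, the problem lies in the library search path rather than in a particular TensorFlow version.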
Thanks in advance, I don't know what else to try...
2 Answers

Answer 1
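After trying many things, I created a new conda environment and installed tensorflow-gpu into it, since I did not care about the TF version.
That installed all the required packages, including cudatoolkit and cudnn.
After that, for reasons I do not understand, TF detected the Nvidia card.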
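One common way to confirm that TensorFlow sees the card is to list the physical GPU devices. This is a minimal sketch and not necessarily the check used in this answer; `visible_gpus` is a hypothetical wrapper around `tf.config.list_physical_devices`, written so that it also runs where TensorFlow is not installed.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment")
    elif not gpus:
        print("TensorFlow imported, but no GPU was detected")
    else:
        print("GPUs detected:", gpus)
```

An empty list together with the `Could not load dynamic library` warnings from the question means TensorFlow fell back to the CPU.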

Answer 2
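Make sure you follow the TensorFlow software compatibility table: https://www.tensorflow.org/install/source#gpu
More details here: https://stackoverflow.com/a/50622526
I ran into this problem myself and solved it by downgrading Python and TensorFlow to 3.6 and 2.4.0, respectively, so that the TensorFlow compatibility requirements were met.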
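The compatibility check can be written down as a small lookup. The sketch below is illustrative only: the rows in `TESTED` are my transcription of the Linux GPU entries for TensorFlow 2.4 to 2.7 from the tested-configurations table and should be re-checked against the linked page, and `mismatches` is a hypothetical helper.

```python
import sys

# Assumed values, transcribed from the tested build configurations table (Linux, GPU).
# Verify them against the linked TensorFlow page before relying on them.
TESTED = {
    "2.4.0": {"python": ((3, 6), (3, 8)), "cudnn": "8.0", "cuda": "11.0"},
    "2.5.0": {"python": ((3, 6), (3, 9)), "cudnn": "8.1", "cuda": "11.2"},
    "2.6.0": {"python": ((3, 6), (3, 9)), "cudnn": "8.1", "cuda": "11.2"},
    "2.7.0": {"python": ((3, 7), (3, 9)), "cudnn": "8.1", "cuda": "11.2"},
}


def mismatches(tf_version, python_version, cuda, cudnn):
    """List the ways a setup differs from the tested row for `tf_version`."""
    row = TESTED.get(tf_version)
    if row is None:
        return [f"no row recorded here for tensorflow {tf_version}"]
    problems = []
    lowest, highest = row["python"]
    major_minor = tuple(python_version[:2])
    if not lowest <= major_minor <= highest:
        problems.append(f"python {major_minor} outside {lowest}..{highest}")
    if cuda != row["cuda"]:
        problems.append(f"CUDA {cuda} != tested {row['cuda']}")
    if cudnn != row["cudnn"]:
        problems.append(f"cuDNN {cudnn} != tested {row['cudnn']}")
    return problems


if __name__ == "__main__":
    # CUDA and cuDNN values taken from the question; Python from the running interpreter.
    for version in TESTED:
        print(version, mismatches(version, sys.version_info, "11.2", "8.1") or "matches")
```

A reported mismatch does not prove that a combination fails, since the table only lists what was tested, but it points at the first thing to change.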