I am trying to install Torch with CUDA support.
Here is the output of my collect_env.py script:
PyTorch version: 1.7.1+cu101
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.1 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: Could not collect
Python version: 3.9 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: 10.1.243
GPU models and configuration: GPU 0: GeForce GTX 1080
Nvidia driver version: 460.39
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.19.2
[pip3] torch==1.7.1+cu101
[pip3] torchaudio==0.7.2
[pip3] torchvision==0.8.2+cu101
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.1.243 h6bb024c_0
[conda] mkl 2020.2 256
[conda] mkl-service 2.3.0 py39he8ac12f_0
[conda] mkl_fft 1.3.0 py39h54f3939_0
[conda] mkl_random 1.0.2 py39h63df603_0
[conda] numpy 1.19.2 py39h89c1606_0
[conda] numpy-base 1.19.2 py39h2ae0177_0
[conda] torch 1.7.1+cu101 pypi_0 pypi
[conda] torchaudio 0.7.2 pypi_0 pypi
[conda] torchvision 0.8.2+cu101 pypi_0 pypi
Process finished with exit code 0
Here is the output of nvcc -V:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
Finally, here is the output of nvidia-smi:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.39 Driver Version: 460.39 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX 1080 Off | 00000000:01:00.0 On | N/A |
| 0% 52C P0 46W / 180W | 624MiB / 8116MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 873 G /usr/lib/xorg/Xorg 101MiB |
| 0 N/A N/A 1407 G /usr/lib/xorg/Xorg 419MiB |
| 0 N/A N/A 2029 G ...AAAAAAAAA= --shared-files 90MiB |
+-----------------------------------------------------------------------------+
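A note on reading the output above: the "CUDA Version: 11.2" field in the nvidia-smi header is the highest CUDA version the installed driver supports, not the installed toolkit version, which nvcc reports as 10.1. A minimal sketch, using only the Python standard library, that extracts both header fields and checks that the driver covers the toolkit (the function names are illustrative, not part of any NVIDIA or PyTorch tool):

```python
import re

def parse_smi_header(line):
    """Return (driver_version, max_cuda_version) from an nvidia-smi header line."""
    match = re.search(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", line)
    if match is None:
        raise ValueError("not an nvidia-smi header line")
    return match.group(1), match.group(2)

def version_tuple(text):
    """Turn '10.1.243' into (10, 1, 243) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def driver_covers_toolkit(max_cuda, toolkit):
    """True if the driver's maximum CUDA version is at least the toolkit's major.minor."""
    return version_tuple(max_cuda)[:2] >= version_tuple(toolkit)[:2]

# Header line copied from the nvidia-smi output in the question.
header = "| NVIDIA-SMI 460.39 Driver Version: 460.39 CUDA Version: 11.2 |"
driver, max_cuda = parse_smi_header(header)
print(driver, max_cuda, driver_covers_toolkit(max_cuda, "10.1.243"))
```

By this check the 460.39 driver is new enough for the 10.1 toolkit, so a driver/toolkit mismatch alone would not explain the failure.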
However, when I try to run
import torch
print(torch.cuda.is_available())
I get the following warning:
UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
return torch._C._cuda_getDeviceCount() > 0
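To gather everything relevant to this warning in one place, a small diagnostic along the following lines may help. It is only a sketch: `cuda_report` is a hypothetical helper name, and it assumes that the CUDA initialization message surfaces through Python's standard `warnings` machinery, as the `UserWarning` prefix suggests.

```python
import os
import warnings

def cuda_report():
    """Collect the facts relevant to a failing torch.cuda.is_available() call."""
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}
    try:
        import torch
    except ImportError:
        report["torch_installed"] = False
        return report
    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["built_with_cuda"] = torch.version.cuda  # None for CPU-only builds
    # Record warnings instead of printing them, so the message can be inspected.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        report["cuda_available"] = torch.cuda.is_available()
    report["warnings"] = [str(item.message) for item in caught]
    return report

if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

An unset `CUDA_VISIBLE_DEVICES` together with a non-empty `built_with_cuda` points away from the environment variable named in the warning and toward the driver or kernel module side.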
I have already rebooted and followed the post-installation steps detailed here.
2 Answers
piztneat1#
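Your CUDA and NVIDIA driver installation is fine. The problem is the combination of your PyTorch and CUDA versions: you need at least CUDA 10.2 to install the latest version of Torch with Python 3.9 support.
If you simply create a new environment with conda, conda takes care of the CUDA toolkit. Note also that pip and conda do not mix well: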
Create a new conda environment:
For CUDA 11.1:
For CUDA 10.2:
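The commands for these three steps did not survive in the post. The sketch below only assembles plausible conda commands and prints them; nothing is executed. The environment name `cuda_env` is hypothetical, and the package list and channels are assumptions based on the usual PyTorch conda instructions of that period, so check them against the PyTorch installation page before use.

```python
import shlex

ENV_NAME = "cuda_env"  # hypothetical environment name, not from the post

def conda_commands(cuda_version):
    """Return the conda commands as argument lists for CUDA '11.1' or '10.2'."""
    install = ["conda", "install", "--name", ENV_NAME,
               "pytorch", "torchvision", "torchaudio",
               f"cudatoolkit={cuda_version}", "-c", "pytorch"]
    if cuda_version == "11.1":
        # Assumption: the 11.1 toolkit package was served from conda-forge.
        install += ["-c", "conda-forge"]
    elif cuda_version != "10.2":
        raise ValueError("the answer only covers CUDA 10.2 and 11.1")
    create = ["conda", "create", "--name", ENV_NAME, "python=3.9"]
    return [create, install]

for command in conda_commands("11.1"):
    print(shlex.join(command))  # printed only; run them yourself in a shell
```

Calling `conda_commands("10.2")` gives the CUDA 10.2 variant, which uses only the `pytorch` channel.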
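If you are using pip rather than an Anaconda environment:
See the PyTorch Installation Docs / Requirements.
The latest version of Torch supports only CUDA 10.2 and 11.1.
Try installing CUDA 10.2 or 11.1.
Then upgrade pip and reinstall Torch: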
Uninstall the currently installed version of Torch with the following command:
Upgrade pip:
Install PyTorch:
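The three commands are missing from the post. As a sketch, they can be assembled and printed like this without running anything. The version pins and the wheel index URL are assumptions (the CUDA 11.1 builds from the time the answer was written); take the exact line from the PyTorch installation page.

```python
import shlex

# Assumption: pins and wheel index for the CUDA 11.1 builds of that period.
WHEEL_INDEX = "https://download.pytorch.org/whl/torch_stable.html"
PACKAGES = ["torch==1.8.0+cu111", "torchvision==0.9.0+cu111", "torchaudio==0.8.0"]

def pip_commands(python="python3"):
    """Return the uninstall, upgrade, and install commands as argument lists."""
    pip = [python, "-m", "pip"]  # calls pip through the chosen interpreter
    return [
        pip + ["uninstall", "--yes", "torch"],
        pip + ["install", "--upgrade", "pip"],
        pip + ["install", *PACKAGES, "--find-links", WHEEL_INDEX],
    ]

for command in pip_commands():
    print(shlex.join(command))  # printed only; run them yourself in a shell
```

Invoking pip as `python3 -m pip` ensures the packages land in the same interpreter that will later import `torch`.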
qlckcl4x2#
I had the same problem. In my case the solution was very simple, but it was not easy to find: I had to remove and re-insert the nvidia_uvm kernel module. So:
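The commands themselves are missing from the post. A minimal sketch, assuming the standard `rmmod` and `modprobe` tools run through `sudo`: it checks whether the module is currently loaded and prints the two reload commands without executing them. The helper names are illustrative.

```python
import shlex

MODULE = "nvidia_uvm"

def module_loaded(name, modules_text):
    """True if `name` is the first field of a line in /proc/modules-style text."""
    return any(line.split()[0] == name
               for line in modules_text.splitlines() if line.strip())

def reload_commands(name=MODULE):
    """Commands that remove and re-insert a kernel module (root required)."""
    return [["sudo", "rmmod", name], ["sudo", "modprobe", name]]

try:
    with open("/proc/modules") as handle:
        print(f"{MODULE} loaded:", module_loaded(MODULE, handle.read()))
except OSError:
    print("/proc/modules is not readable on this system")

for command in reload_commands():
    print(shlex.join(command))  # printed only; run them yourself in a shell
```

Note that `rmmod` refuses to unload a module that is still in use, so any process holding the GPU for compute work has to be stopped first.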
Just before running these commands, collect_env.py reported "Is CUDA available: False". Afterwards: "Is CUDA available: True".