pytorch: How to install NVIDIA apex on Google Colab

lf5gs5x2 posted on 2022-11-29 in Go

What I did was follow the instructions on the official GitHub page:

!git clone https://github.com/NVIDIA/apex
!cd apex
!pip install -v --no-cache-dir ./

It gives me an error:

ERROR: Directory './' is not installable. Neither 'setup.py' nor 'pyproject.toml' found.
Exception information:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/cli/base_command.py", line 178, in main
    status = self.run(options, args)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/commands/install.py", line 326, in run
    self.name, wheel_cache
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/cli/base_command.py", line 268, in populate_requirement_set
    wheel_cache=wheel_cache
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/req/constructors.py", line 248, in install_req_from_line
    "nor 'pyproject.toml' found." % name
pip._internal.exceptions.InstallationError: Directory './' is not installable. Neither 'setup.py' nor 'pyproject.toml' found.

qrjkbowd #1

It worked for me after first setting the CUDA_HOME environment variable.
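The exact value used in this answer is not shown. As a minimal sketch, assuming the CUDA toolkit lives under /usr/local/cuda (an assumption, as is the helper name set_cuda_home), the variable could be set from a notebook cell like this, so that later `!` commands inherit it:

```python
import os

# Assumption: the toolkit is installed under /usr/local/cuda; change this if yours differs.
ASSUMED_CUDA_HOME = "/usr/local/cuda"

def set_cuda_home(path=ASSUMED_CUDA_HOME):
    """Set CUDA_HOME for this process and report whether the directory exists."""
    os.environ["CUDA_HOME"] = path
    return os.path.isdir(path)

found = set_cuda_home()
status = "directory exists" if found else "directory not found"
print("CUDA_HOME =", os.environ["CUDA_HOME"], f"({status})")
```

Child processes started from the notebook after this cell, including the pip build, see the same value.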

8cdiaqws #2

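(I wanted to add a comment, but I don't have enough reputation...)
It worked for me, but the cd is not actually needed. I also needed the two global options suggested here: https://github.com/NVIDIA/apex/issues/86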

%%writefile setup.sh

git clone https://github.com/NVIDIA/apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./apex

Then:

!sh setup.sh

r1zk6ea1 #3

Updated

First, create a file, e.g. setup.sh, as follows.
For apex with the CUDA and C++ extensions:

%%writefile setup.sh

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Then install:

!sh setup.sh

For a Python-only build:

%%writefile setup.sh

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir ./

The Python-only build omits certain fused kernels required by apex.optimizers.FusedAdam, apex.normalization.FusedLayerNorm, etc.
See the apex Quick Start.
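To see which kind of build ended up installed, one option is to check whether the compiled extension modules can be found. The sketch below assumes the extensions are importable as apex_C, amp_C and fused_layer_norm_cuda; those names are an assumption about how apex lays out its compiled modules and may differ between versions. The helper names are hypothetical.

```python
import importlib.util

# Assumed names of apex's compiled extension modules; adjust if your build differs.
ASSUMED_EXTENSIONS = ("apex_C", "amp_C", "fused_layer_norm_cuda")

def is_installed(module_name):
    """Return True if the module can be located, without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False

def missing_extensions(names=ASSUMED_EXTENSIONS):
    """List the extension modules from `names` that cannot be located."""
    return [name for name in names if not is_installed(name)]

print("Missing extension modules:", missing_extensions())
```

An empty list suggests the CUDA and C++ extensions were built; a non-empty one suggests a Python-only install.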

zkure5ic #4

In Colab, use "%" instead of "!" before the cd command:

!git clone https://github.com/NVIDIA/apex
%cd apex
!pip install -v --no-cache-dir ./

The code above works fine.

dfuffjeb #5

I tried several options, but I like the one from this website, which works very well with fast_bert and torch:

try:
  import apex
except Exception:
  !git clone https://github.com/NVIDIA/apex.git
  %cd apex
  !pip install --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .
  %cd ..

oymdgrw7 #6

The problem is !cd apex. Use %cd apex instead.
Please read: https://stackoverflow.com/a/57212513/8690463
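The reason is that each `!` command runs in its own child shell, so a directory change made there is lost as soon as that shell exits, while `%cd` changes the working directory of the notebook process itself. A small sketch of the difference in plain Python, where subprocess stands in for `!` and os.chdir stands in for `%cd` (the temporary directory is only for illustration):

```python
import os
import subprocess
import tempfile

start = os.getcwd()
target = tempfile.mkdtemp()

# Like `!cd target`: the change happens in a child shell and is lost when it exits.
subprocess.run(f'cd "{target}"', shell=True, check=True)
after_subshell = os.getcwd()

# Like `%cd target`: the change applies to the current process.
os.chdir(target)
after_chdir = os.getcwd()

os.chdir(start)  # restore the original working directory

print("unchanged after subshell cd:", after_subshell == start)
print("changed after os.chdir:", os.path.realpath(after_chdir) == os.path.realpath(target))
```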

k7fdbhmy #7

I use Paperspace, and this worked for me:

!pip install git+https://github.com/NVIDIA/apex

xriantvc #8

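As of November 2022, the following worked for me.
apex.optimizers.FusedAdam, apex.normalization.FusedLayerNorm, etc. require the CUDA and C++ extensions (see here, for example). Installing only the Python-only build is therefore not enough. To build apex, the CUDA version used by PyTorch and the one used to build apex must match, as described below.
Query the Ubuntu version that Colab is running: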

!lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.6 LTS
Release:    18.04
Codename:   bionic

To get the current CUDA version, run:

!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
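The release number in this output is what has to agree with the CUDA version PyTorch was built with. A minimal sketch of that comparison, assuming the `release X.Y` wording shown above (the helper names are hypothetical):

```python
import re

def parse_nvcc_release(nvcc_output):
    """Return the 'major.minor' CUDA release from `nvcc --version` text, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None

def cuda_versions_match(toolkit_version, torch_cuda_version):
    """Compare major.minor only; returns False if either version is unknown."""
    if toolkit_version is None or torch_cuda_version is None:
        return False
    return toolkit_version.split(".")[:2] == torch_cuda_version.split(".")[:2]

sample = "Cuda compilation tools, release 11.2, V11.2.152"
toolkit = parse_nvcc_release(sample)
print(toolkit)                               # 11.2
print(cuda_versions_match(toolkit, "11.7"))  # False
```

In a notebook, the second argument would typically be torch.version.cuda, which is None for CPU-only PyTorch builds.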

Look up the latest PyTorch build and its compute platform here.
Next, go to the CUDA Toolkit Archive and configure a release that matches PyTorch's CUDA version and your OS version.

Copy the installation instructions:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu1804-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda

Remove sudo and change the last line to include your CUDA version, e.g. !apt-get -y install cuda-11-7 (without the exclamation mark if you run it directly in a shell):

!wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
!mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
!wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
!dpkg -i cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
!cp /var/cuda-repo-ubuntu1804-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
!apt-get update
!apt-get -y install cuda-11-7

Your CUDA version will now be updated:

!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0

Next, update the outdated PyTorch version in Google Colab:

!pip install torch -U

Build apex. Depending on your situation, you may need fewer global options:

!git clone https://github.com/NVIDIA/apex.git && cd apex && pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" --global-option="--fast_multihead_attn" . && cd .. && rm -rf apex

...
Successfully installed apex-0.1
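Since the set of global options varies by use case, the pip command can also be assembled from a list of extension flags. This is only a sketch: the function name and its defaults are hypothetical, and the flags themselves are the ones used in this thread.

```python
def apex_install_command(source_dir=".", extensions=("cpp_ext", "cuda_ext")):
    """Build the pip argument list for installing apex from a local checkout."""
    cmd = ["pip", "install", "-v", "--no-cache-dir"]
    for ext in extensions:
        # Each extension flag is forwarded to setup.py through --global-option.
        cmd.append(f"--global-option=--{ext}")
    cmd.append(source_dir)
    return cmd

# Python-only build: no extension flags at all.
print(" ".join(apex_install_command("./apex", extensions=())))
# Build with the C++ and CUDA extensions.
print(" ".join(apex_install_command("./apex")))
```

The returned list can be passed to subprocess.run(cmd, check=True), or joined into a string and pasted after `!` in a notebook cell.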

Now apex can be imported as needed:

from apex import optimizers, normalization
...
