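I cannot find a way to use the CUDA language in a CMake project on Windows with the standard MSVC 2019 compiler.
I am trying to configure and compile this hello-cmake-cuda repository (also described in this blog post). The CMakeLists.txt file contains: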
cmake_minimum_required(VERSION 3.8 FATAL_ERROR)
project(hello LANGUAGES CXX CUDA)
enable_language(CUDA)
add_executable(hello hello.cu)
Here is the output of the cmake .. command, run from the build directory:
PS C:\GitRepo\cuda_hello\build> cmake ..
-- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.22000.
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:311 (message):
CMAKE_CUDA_ARCHITECTURES must be valid if set.
Call Stack (most recent call first):
CMakeLists.txt:5 (project)
-- Configuring incomplete, errors occurred!
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeOutput.log".
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeError.log".
This means that architectures_tested at CMakeDetermineCUDACompiler.cmake:311 is empty...
How can I get CMake to finish configuring and build this simple program?
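My development environment
- OS: Windows 11, version 10.0.22000, build 22000
- Compiler: Microsoft Visual Studio Community 2019, version 16.11.11
- CMake version 3.23
- CUDA version 11.6
I have tried different versions of each of these tools and keep hitting the same problem, so I have decided to stay with these versions for now.
My GPU is configured correctly: it shows up in nvidia-smi, and I can build and run the deviceQuery CUDA sample: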
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "NVIDIA GeForce GTX 1650"
CUDA Driver Version / Runtime Version 11.6 / 11.6
CUDA Capability Major/Minor version number: 7.5
etc. etc. ...
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.6, CUDA Runtime Version = 11.6, NumDevs = 1
Result = PASS
My PATH environment variable:
PS C:\GitRepo\hello-cuda-cmake-master> $env:path -split ";"
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\libnvvp
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\libnvvp
C:\Program Files (x86)\Common Files\Oracle\Java\javapath
C:\Python38\Scripts\
C:\Python38\
C:\Windows\system32
C:\Windows
C:\Windows\System32\Wbem
C:\Windows\System32\WindowsPowerShell\v1.0\
C:\Windows\System32\OpenSSH\
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common
C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR
C:\Program Files\PuTTY\
C:\Program Files (x86)\PuTTY\
C:\Program Files\Microsoft SQL Server\110\Tools\Binn\
C:\Program Files\TortoiseSVN\bin
C:\Program Files\TortoiseGit\bin
C:\Program Files\Microsoft VS Code\bin
C:\WINDOWS\system32
C:\WINDOWS
C:\WINDOWS\System32\Wbem
C:\WINDOWS\System32\WindowsPowerShell\v1.0\
C:\WINDOWS\System32\OpenSSH\
C:\Program Files\Docker\Docker\resources\bin
C:\ProgramData\DockerDesktop\version-bin
C:\Program Files\Git\cmd
C:\WINDOWS\system32
C:\WINDOWS
C:\WINDOWS\System32\Wbem
C:\WINDOWS\System32\WindowsPowerShell\v1.0\
C:\WINDOWS\System32\OpenSSH\
C:\Program Files\NVIDIA Corporation\Nsight Compute 2022.1.1\
C:\Program Files\CMake\bin
C:\Ruby30-x64\bin
C:\Users\Thibault GEFFROY\.cargo\bin
C:\Users\Thibault GEFFROY\AppData\Local\Microsoft\WindowsApps
C:\Program Files\OpenCppCoverage
C:\intelFPGA\20.1\modelsim_ase\win32aloem
What I have tried, without success
If I set the required CMAKE_CUDA_ARCHITECTURES with set(CMAKE_CUDA_ARCHITECTURES 75), I get:
PS C:\GitRepo\cuda_hello\build> cmake ..
-- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.22000.
-- The CUDA compiler identification is unknown
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:654 (message):
The CMAKE_CUDA_ARCHITECTURES:
75
do not all work with this compiler. Try:
instead.
Call Stack (most recent call first):
CMakeLists.txt:5 (project)
-- Configuring incomplete, errors occurred!
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeOutput.log".
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeError.log".
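The "CUDA compiler identification is unknown" line suggests that CMake could not get nvcc to compile its test file at all. Below is a minimal sketch, assuming CMake's CheckLanguage module and the same hello.cu source, that probes for a working CUDA compiler before enabling the language, so that a broken toolchain produces a readable message instead of an architecture error. It is only a diagnostic aid and is not expected to fix the underlying problem:

```cmake
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)

# Enable only C++ first, so that project() itself cannot fail on CUDA detection.
project(hello LANGUAGES CXX)

include(CheckLanguage)

# check_language() tries the language in a scratch build tree; afterwards
# CMAKE_CUDA_COMPILER holds a compiler path, or NOTFOUND if the probe failed.
check_language(CUDA)

if(CMAKE_CUDA_COMPILER)
  enable_language(CUDA)
  add_executable(hello hello.cu)
else()
  # The wording of this message is illustrative only.
  message(WARNING "No working CUDA compiler found; check the nvcc and MSVC host compiler setup")
endif()
```

If the warning branch is taken, the problem most likely lies in the nvcc/MSVC installation rather than in the CMakeLists.txt.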
If I use the FindCUDA module to set CMAKE_CUDA_ARCHITECTURES (the solution given by @alfC here), I get:
PS C:\GitRepo\cuda_hello\build> cmake ..
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/FindCUDA/select_compute_arch.cmake:120 (file):
file failed to open for writing (Permission denied):
/detect_cuda_compute_capabilities.cpp
Call Stack (most recent call first):
CMakeLists.txt:4 (CUDA_DETECT_INSTALLED_GPUS)
CMake Error: The source directory "CMAKE_FLAGS" does not exist.
Specify --help for usage, or press the help button on the CMake GUI.
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/FindCUDA/select_compute_arch.cmake:141 (try_run):
Failed to configure test project build system.
Call Stack (most recent call first):
CMakeLists.txt:4 (CUDA_DETECT_INSTALLED_GPUS)
CMake Error: TRY_COMPILE attempt to remove -rf directory that does not contain CMakeTmp:/detect_cuda_compute_capabilities.cpp
-- Configuring incomplete, errors occurred!
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeOutput.log".
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeError.log".
Finally, if I call find_package(CUDA), I get:
PS C:\GitRepo\cuda_hello\build> cmake ..
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/FindCUDA.cmake:677 (cmake_initialize_per_config_variable):
Unknown CMake command "cmake_initialize_per_config_variable".
Call Stack (most recent call first):
CMakeLists.txt:2 (find_package)
-- Configuring incomplete, errors occurred!
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeOutput.log".
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeError.log".
Edit 1:
In reply to this answer from @einpoklum:
Thanks for the suggestion, but it does not work either.
Here is the output of the cmake -B build command in your repository:
PS C:\GitRepo\hello-cuda-cmake-master> cmake -B build
-- Building for: Visual Studio 16 2019
-- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.22000.
-- The CUDA compiler identification is unknown
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:633 (message):
Failed to detect a default CUDA architecture.
Compiler output:
Call Stack (most recent call first):
CMakeLists.txt:2 (project)
-- Configuring incomplete, errors occurred!
See also "C:/GitRepo/hello-cuda-cmake-master/build/CMakeFiles/CMakeOutput.log".
See also "C:/GitRepo/hello-cuda-cmake-master/build/CMakeFiles/CMakeError.log".
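The output is the same whether I use PowerShell or the MSVC command prompt.
Here are the CMake variables and their values when using cmake-gui:
When I run a plain nvcc build command, nvcc hello.cu, from the MSVC command prompt, I get: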
nvcc fatal : Could not set up the environment for Microsoft Visual Studio using 'c:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/HostX86/x86/../../../../../../../VC/Auxiliary/Build/vcvars64.bat'
However, the PATH is valid, and the vcvars64.bat script does exist at that location.
What happens if find_package(CUDAToolkit) is added to CMakeLists.txt
The new CMakeLists.txt:
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)
find_package(CUDAToolkit)
project(hello LANGUAGES CUDA)
add_executable(hello hello.cu)
Output:
PS C:\GitRepo\hello-cuda-cmake-master> cmake -B build
-- Building for: Visual Studio 16 2019
-- Found CUDAToolkit: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.6/include (found version "11.6.124")
-- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.22000.
-- The CUDA compiler identification is unknown
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:633 (message):
Failed to detect a default CUDA architecture.
Compiler output:
Call Stack (most recent call first):
CMakeLists.txt:3 (project)
-- Configuring incomplete, errors occurred!
See also "C:/GitRepo/hello-cuda-cmake-master/build/CMakeFiles/CMakeOutput.log".
See also "C:/GitRepo/hello-cuda-cmake-master/build/CMakeFiles/CMakeError.log".
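For comparison, here is a minimal sketch of the more conventional ordering, in which project() comes before any find_package() call. It assumes CMake 3.17 or newer (where the FindCUDAToolkit module and its CUDA::cudart imported target exist) and the same hello.cu source. Since the failure above is raised inside project() itself, this ordering alone may well not resolve it:

```cmake
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)

# project() sets up the toolchain; package lookups normally come after it.
project(hello LANGUAGES CXX CUDA)

# FindCUDAToolkit locates the toolkit and defines imported targets such as CUDA::cudart.
find_package(CUDAToolkit REQUIRED)

add_executable(hello hello.cu)

# Illustrative only: a plain .cu executable usually gets the CUDA runtime
# from nvcc already, so this explicit link mainly matters for C++ targets.
target_link_libraries(hello PRIVATE CUDA::cudart)
```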
Edit 2:
I tried to compile the CUDA sample BlackScholes without CMake, using the provided MSVC 2019 solution.
It ends with the following error:
Severity Code Description Project File Line Suppression State
Error MSB3721 The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc.exe" -gencode=arch=compute_35,code=\"sm_35,compute_35\" -gencode=arch=compute_37,code=\"sm_37,compute_37\" -gencode=arch=compute_50,code=\"sm_50,compute_50\" -gencode=arch=compute_52,code=\"sm_52,compute_52\" -gencode=arch=compute_60,code=\"sm_60,compute_60\" -gencode=arch=compute_61,code=\"sm_61,compute_61\" -gencode=arch=compute_70,code=\"sm_70,compute_70\" -gencode=arch=compute_75,code=\"sm_75,compute_75\" -gencode=arch=compute_80,code=\"sm_80,compute_80\" -gencode=arch=compute_86,code=\"sm_86,compute_86\" --use-local-env -ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64" -x cu -I./ -I../../../Common -I./ -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\/include" -I../../../Common -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\include" -G --keep-dir x64\Debug -maxrregcount=0 --machine 64 --compile -cudart static -Xcompiler "/wd 4819" --threads 0 -g -DWIN32 -DWIN32 -D_MBCS -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Fdx64/Debug/vc142.pdb /FS /Zi /RTC1 /MTd " -o "C:\ProgramData\NVIDIA Corporation\CUDA Samples\v11.6\cuda-samples\Samples\5_Domain_Specific\BlackScholes\x64\Debug\BlackScholes.cu.obj" "C:\ProgramData\NVIDIA Corporation\CUDA Samples\v11.6\cuda-samples\Samples\5_Domain_Specific\BlackScholes\BlackScholes.cu"" exited with code 1. BlackScholes C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Microsoft\VC\v160\BuildCustomizations\CUDA 11.6.targets 790
When I build the BlackScholes sample under WSL 2 with Ubuntu 20.04, using the following CUDA installation and these instructions, I get the following output:
$ sudo make BlackScholes
/usr/local/cuda/bin/nvcc -ccbin g++ -I../../../Common -m64 -maxrregcount=16 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o BlackScholes.o -c BlackScholes.cu
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
ptxas warning : For profile sm_86 adjusting per thread register count of 16 to lower bound of 24
ptxas warning : For profile sm_80 adjusting per thread register count of 16 to lower bound of 24
ptxas warning : For profile sm_70 adjusting per thread register count of 16 to lower bound of 24
ptxas warning : For profile sm_75 adjusting per thread register count of 16 to lower bound of 24
/usr/local/cuda/bin/nvcc -ccbin g++ -I../../../Common -m64 -maxrregcount=16 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o BlackScholes_gold.o -c BlackScholes_gold.cpp
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/local/cuda/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o BlackScholes BlackScholes.o BlackScholes_gold.o
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
mkdir -p ../../../bin/x86_64/linux/release
cp BlackScholes ../../../bin/x86_64/linux/release
$ ./BlackScholes
[./BlackScholes] - Starting...
GPU Device 0: "Turing" with compute capability 7.5
Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
...generating input data in CPU mem.
...copying input data to GPU mem.
Data init done.
Executing Black-Scholes GPU kernel (512 iterations)...
Options count : 8000000
BlackScholesGPU() time : 0.722482 msec
Effective memory bandwidth: 110.729334 GB/s
Gigaoptions per second : 11.072933
BlackScholes, Throughput = 11.0729 GOptions/s, Time = 0.00072 s, Size = 8000000 options, NumDevsUsed = 1, Workgroup = 128
Reading back GPU results...
Checking the results...
...running CPU calculations.
Comparing the results...
L1 norm: 1.741792E-07
Max absolute error: 1.192093E-05
Shutting down...
...releasing GPU memory.
...releasing CPU memory.
Shutdown done.
[BlackScholes] - Test Summary
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
Test passed
4 Answers
Answer 1:
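As of CMake 3.18, we no longer use the FindCUDA.cmake module, either directly or via find_package(CUDA); it has been superseded by find_package(CUDAToolkit) (which uses the FindCUDAToolkit.cmake module). But for your simple hello-world project you don't even need that, because CUDA has been a "first-class citizen" language in CMake since version 3.8. Well, sort of. So, here is a CMakeLists.txt file you can use:
I have tested this on a Windows 10 (Enterprise Evaluation) virtual machine with CUDA 11.6 and Visual Studio 16 (a.k.a. VS 2019).
Note: the version number in the cmake_minimum_required() line may be critical! With the version number used in the cuda_hello repository it did not work for me, because a CMAKE_CUDA_ARCHITECTURES value is *required* to exist.
Now, after configuring with CMake, you can run ccmake, where you will see the value of CMAKE_CUDA_ARCHITECTURES. Change it to whatever value you want to use.
Again, what I am offering is the simplest and most basic approach, not necessarily the fanciest or most robust one. I have set all of this up for you in a fork of the hello-cuda-cmake repository.
Answer 2: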
Try adding:
Check your CUDA architecture at https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/ and change the value of CMAKE_CUDA_ARCHITECTURES accordingly. Also point CMAKE_CUDA_COMPILER at nvcc. Here is my complete CMakeLists.txt:
My GPU is a GeForce GTX 1660, with CMake 3.23 and CUDA 11.6.
Here is a Docker image I made for developing some projects: https://github.com/GuangchenJ/cuda-dev; you can try using it.
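The two steps described in this answer can be sketched roughly as follows. This is not the answerer's actual file, only an illustration under stated assumptions: compute capability 7.5 (which matches both the GTX 1650 in the question and the GTX 1660 here) and the default CUDA 11.6 install path shown in the question's PATH listing.

```cmake
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)

# Assumption: compute capability 7.5; look up your own GPU and adjust.
# Set before project() so that it is visible during CUDA compiler detection.
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
  set(CMAKE_CUDA_ARCHITECTURES 75)
endif()

# Assumption: default CUDA 11.6 location on Windows; adjust to your install.
# Visual Studio generators may ignore this and take the compiler from the
# installed CUDA toolset integration instead.
if(NOT DEFINED CMAKE_CUDA_COMPILER)
  set(CMAKE_CUDA_COMPILER
      "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.6/bin/nvcc.exe")
endif()

project(hello LANGUAGES CXX CUDA)

add_executable(hello hello.cu)
```

The if(NOT DEFINED ...) guards are a design choice: they let a value passed with -D on the command line override the defaults written in the file.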
Answer 3:
OS environment:
1. It can be built
The project name is:
hellogpu
CMake file:
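Answer 4:
I ran into the same problem and solved it by installing an older version of CMake; more precisely, a version before 3.18.
Apparently CMake reworked its first-party CUDA language support in 3.18 (that release introduced CMAKE_CUDA_ARCHITECTURES; the language itself has been supported since 3.8), which is where these nonsensical messages ("Try: instead") come from.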