PyTorch: how do I use cpp_extension with an existing library built on top of libtorch?

Posted by cyvaqqii on 2023-06-23 in Other

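I have successfully completed the C++/CUDA extension tutorial, where I wrote and bound a utility different from the one the tutorial uses.
I then wrote and compiled a CMake project that builds a static library, using the same utility as in the extension tutorial.
Next, I want to work out how to take the library I built (.lib) and bind it to Python. In other words, I want to combine the two tutorials linked above.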

What I have tried:

In my original project, where the library lives, I added a new directory, pybind:

├───external
│   └───libtorch
│       ├───bin
│       ├───cmake
│       ├───include
│
├───libray
│   │   CMakeLists.txt
│   │
│   ├───include    *** Public API of libray ***
│   │       CMakeLists.txt
│   │       raycast.cpp
│   │       raycast.h
│   │
│   └───src        *** Helpers, PRIVATE in cmake ***
│           raycast_cuda.cu
│           raycast_cuda.cuh
│           torch_checks.cpp
│           torch_checks.h
│
└───pybind
        extension.cpp
        setup.py

In that directory I added extension.cpp, which includes the header of the utility I want to bind and then binds the functions that header declares:

#include <torch/extension.h>
#include "raycast.h"

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("find_distance", &find_distances, "Get minimum distance from ray origin to surface.");
  m.def("find_intersections", &find_intersections, "Get ray's intersection with surface.");
}

Alongside it is the setup.py described in the first link of my question:

from setuptools import setup, Extension
from torch.utils import cpp_extension

setup(name='raycast',
      ext_modules=
      [
          Extension(
              name="raycast", 
              sources=['extension.cpp'],
              include_dirs=[
                  '../external/libtorch/include', 
                  "../external/libtorch/include/torch/csrc/api/include",
                  "../libray/include"],
              library_dirs=[
                  "../build/libray/include/Debug/", 
                  "../external/libtorch/lib"],
              libraries=["libray", "torch", "torch_cpu", "torch_cuda", "c10"])
      ],
      cmdclass={'build_ext': cpp_extension.BuildExtension})

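**Note:** I also tried adding the private header files to the ext_modules list. That defeats the purpose of the question, and it did not work either.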

What happens

1. When I run python setup.py install, I get an error reporting 17 unresolved externals.

extension.obj : error LNK2001: unresolved external symbol "class at::Tensor __cdecl find_distances(class at::Tensor,class at::Tensor,class at::Tensor,class at::Tensor)" (?find_distances@@YA?AVTensor@at@@V12@000@Z)
extension.obj : error LNK2001: unresolved external symbol "__declspec(dllimport) public: __cdecl pybind11::detail::type_caster<class at::Tensor,void>::type_caster<class at::Tensor,void>(void)" (__imp_??0?$type_caster@VTensor@at@@X@detail@pybind11@@QEAA@XZ)
extension.obj : error LNK2001: unresolved external symbol "__declspec(dllimport) public: __cdecl pybind11::detail::type_caster<class at::Tensor,void>::~type_caster<class at::Tensor,void>(void)" (__imp_??1?$type_caster@VTensor@at@@X@detail@pybind11@@QEAA@XZ)
extension.obj : error LNK2001: unresolved external symbol "__declspec(dllimport) public: __cdecl pybind11::detail::type_caster<class at::Tensor,void>::operator class at::Tensor &&(void)&& " (__imp_??B?$type_caster@VTensor@at@@X@detail@pybind11@@QEHAA$$QEAVTensor@at@@XZ)
libray.lib(raycast.obj) : error LNK2001: unresolved external symbol __imp__invalid_parameter
libray.lib(raycast_cuda.obj) : error LNK2001: unresolved external symbol __imp__invalid_parameter
libray.lib(torch_checks.obj) : error LNK2001: unresolved external symbol __imp__invalid_parameter
libray.lib(raycast.obj) : error LNK2001: unresolved external symbol __imp__calloc_dbg
libray.lib(raycast_cuda.obj) : error LNK2001: unresolved external symbol __imp__calloc_dbg
libray.lib(torch_checks.obj) : error LNK2001: unresolved external symbol __imp__calloc_dbg
libray.lib(raycast.obj) : error LNK2001: unresolved external symbol __imp__CrtDbgReport
libray.lib(raycast_cuda.obj) : error LNK2001: unresolved external symbol __imp__CrtDbgReport
libray.lib(torch_checks.obj) : error LNK2001: unresolved external symbol __imp__CrtDbgReport
libray.lib(raycast_cuda.obj) : error LNK2001: unresolved external symbol cudaLaunchKernel
libray.lib(raycast_cuda.obj) : error LNK2001: unresolved external symbol __cudaPushCallConfiguration
libray.lib(raycast_cuda.obj) : error LNK2001: unresolved external symbol __cudaPopCallConfiguration
libray.lib(raycast_cuda.obj) : error LNK2001: unresolved external symbol __cudaRegisterFatBinary
libray.lib(raycast_cuda.obj) : error LNK2001: unresolved external symbol __cudaRegisterFatBinaryEnd
libray.lib(raycast_cuda.obj) : error LNK2001: unresolved external symbol __cudaUnregisterFatBinary
libray.lib(raycast_cuda.obj) : error LNK2001: unresolved external symbol __cudaRegisterVar
libray.lib(raycast_cuda.obj) : error LNK2001: unresolved external symbol __cudaRegisterFunction
build\lib.win-amd64-3.8\raycast.cp38-win_amd64.pyd : fatal error LNK1120: 17 unresolved externals

2. I also get the following errors, though far fewer of them than the ones above.

libray.lib(raycast.obj) : error LNK2038: mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '2' doesn't match value '0' in extension.obj
libray.lib(raycast.obj) : error LNK2038: mismatch detected for 'RuntimeLibrary': value 'MDd_DynamicDebug' doesn't match value 'MD_DynamicRelease' in extension.obj
libray.lib(raycast_cuda.obj) : error LNK2038: mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '2' doesn't match value '0' in extension.obj
libray.lib(raycast_cuda.obj) : error LNK2038: mismatch detected for 'RuntimeLibrary': value 'MDd_DynamicDebug' doesn't match value 'MD_DynamicRelease' in extension.obj
libray.lib(torch_checks.obj) : error LNK2038: mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '2' doesn't match value '0' in extension.obj
libray.lib(torch_checks.obj) : error LNK2038: mismatch detected for 'RuntimeLibrary': value 'MDd_DynamicDebug' doesn't match value 'MD_DynamicRelease' in extension.obj
   Creating library build\temp.win-amd64-3.8\Release\raycast.cp38-win_amd64.lib and object build\temp.win-amd64-3.8\Release\raycast.cp38-win_amd64.exp
LINK : warning LNK4098: defaultlib 'MSVCRTD' conflicts with use of other libs; use /NODEFAULTLIB:library
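
As a side note on reading the LNK2038 lines above: each one names a build setting, the value recorded in the libray.lib object, and the value recorded in extension.obj. Here is a minimal sketch for summarising them, assuming the message format shown above. The regular expression and the function name `summarize_mismatches` are made up for illustration.

```python
import re
from collections import defaultdict

# Two LNK2038 lines copied from the linker output above.
LOG = r"""
libray.lib(raycast.obj) : error LNK2038: mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '2' doesn't match value '0' in extension.obj
libray.lib(raycast.obj) : error LNK2038: mismatch detected for 'RuntimeLibrary': value 'MDd_DynamicDebug' doesn't match value 'MD_DynamicRelease' in extension.obj
"""

# Hypothetical pattern for this sketch; it only covers the message shape shown above.
MISMATCH = re.compile(
    r"^(?P<obj>\S+) : error LNK2038: mismatch detected for '(?P<key>[^']+)': "
    r"value '(?P<theirs>[^']+)' doesn't match value '(?P<ours>[^']+)' in (?P<ref>\S+)$"
)


def summarize_mismatches(log):
    """Map each setting to the set of (value in the library, value in the reference object)."""
    summary = defaultdict(set)
    for line in log.splitlines():
        match = MISMATCH.match(line.strip())
        if match:
            summary[match["key"]].add((match["theirs"], match["ours"]))
    return dict(summary)


summary = summarize_mismatches(LOG)
for key, pairs in summary.items():
    for theirs, ours in sorted(pairs):
        print(f"{key}: library object has {theirs!r}, extension object has {ours!r}")
```

For this log the summary pairs MDd_DynamicDebug with MD_DynamicRelease and _ITERATOR_DEBUG_LEVEL 2 with 0. That means libray.lib was compiled in a Debug configuration while extension.obj was compiled in Release, which fits the Debug directory listed in library_dirs above.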

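I looked up what the LNK2038 and LNK2001 errors mean. By adding more of the .lib files from libtorch/lib/, I was able to reduce the LNK2001 errors from 158 to 17. However, I still seem to be missing something.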
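
The remaining LNK2001 symbols fall into a few recognisable groups. Below is a rough sketch that buckets them by substring. The mapping from substring to provider is a heuristic assumed for this example, not something the linker reports, and the names `RULES`, `likely_provider` and `group_unresolved` are hypothetical.

```python
import re
from collections import Counter

# Four LNK2001 lines copied from the linker output above.
LOG = r'''
extension.obj : error LNK2001: unresolved external symbol "__declspec(dllimport) public: __cdecl pybind11::detail::type_caster<class at::Tensor,void>::~type_caster<class at::Tensor,void>(void)" (__imp_??1?$type_caster@VTensor@at@@X@detail@pybind11@@QEAA@XZ)
libray.lib(raycast.obj) : error LNK2001: unresolved external symbol __imp__CrtDbgReport
libray.lib(raycast_cuda.obj) : error LNK2001: unresolved external symbol cudaLaunchKernel
libray.lib(raycast_cuda.obj) : error LNK2001: unresolved external symbol __cudaRegisterFunction
'''

# Assumed heuristic: (substring to look for, likely provider of the symbol).
RULES = [
    ("type_caster", "torch_python"),
    ("cuda", "cudart"),
    ("_CrtDbgReport", "debug CRT"),
    ("_calloc_dbg", "debug CRT"),
    ("_invalid_parameter", "debug CRT"),
]

UNRESOLVED = re.compile(r"error LNK2001: unresolved external symbol (?P<symbol>.+)$")


def likely_provider(symbol):
    """Return the first provider whose substring occurs in the symbol text."""
    for needle, provider in RULES:
        if needle in symbol:
            return provider
    return "unknown"


def group_unresolved(log):
    """Count unresolved symbols per likely provider."""
    counts = Counter()
    for line in log.splitlines():
        match = UNRESOLVED.search(line)
        if match:
            counts[likely_provider(match["symbol"])] += 1
    return counts


counts = group_unresolved(LOG)
print(dict(counts))
```

The `__imp__CrtDbgReport`-style symbols belong to the debug C runtime, so they are another symptom of the Debug/Release mismatch rather than a sign of a missing libtorch library.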

My question

What am I missing to resolve these errors: include paths, or additional libraries?

vpfxa7rd


To solve this, I ended up inspecting the contents of the setuptools.Extension object that a call to cpp_extension.CUDAExtension creates.

from setuptools import setup, Extension
from torch.utils import cpp_extension

cuda_extension = cpp_extension.CUDAExtension('raycast', ['raycast.cpp', 'raycast_cuda.cu', 'torch_checks.cpp'])

for attribute in dir(cuda_extension):
    attr = getattr(cuda_extension, attribute)
    print("*********************************")
    print(attribute)
    print(attr)
    print("*********************************")

This revealed the libraries I was missing and where they are located.

*********************************
include_dirs
['C:\\Users\\wesle\\anaconda3\\lib\\site-packages\\torch\\include', 'C:\\Users\\wesle\\anaconda3\\lib\\site-packages\\torch\\include\\torch\\csrc\\api\\include', 'C:\\Users\\wesle\\anaconda3\\lib\\site-packages\\torch\\include\\TH', 'C:\\Users\\wesle\\anaconda3\\lib\\site-packages\\torch\\include\\THC', 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6\\include']
*********************************

*********************************
libraries
['c10', 'torch', 'torch_cpu', 'torch_python', 'cudart', 'c10_cuda', 'torch_cuda_cu', 'torch_cuda_cpp']
*********************************

*********************************
library_dirs
['C:\\Users\\wesle\\anaconda3\\lib\\site-packages\\torch\\lib', 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6\\lib/x64'] 
*********************************
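
Before linking, a list like this can be checked against the search directories to see which libraries will not be found. This is a minimal sketch, assuming the MSVC convention that library `name` is the file `name.lib`. The helper `locate_libraries` is hypothetical, and the demo uses a throwaway directory rather than a real PyTorch installation.

```python
import tempfile
from pathlib import Path


def locate_libraries(libraries, library_dirs, suffix=".lib"):
    """Return {name: first directory containing name + suffix, or None if absent}."""
    found = {}
    for name in libraries:
        found[name] = next(
            (d for d in library_dirs if (Path(d) / f"{name}{suffix}").is_file()),
            None,
        )
    return found


# Demo on a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    lib_dir = Path(tmp) / "torch" / "lib"
    lib_dir.mkdir(parents=True)
    # Pretend only these three libraries are present in the search directory.
    for present in ("c10", "torch", "torch_cpu"):
        (lib_dir / f"{present}.lib").touch()

    result = locate_libraries(
        ["c10", "torch", "torch_cpu", "torch_python"], [str(lib_dir)]
    )
    missing = [name for name, where in result.items() if where is None]
    print("missing:", missing)
```

On a real setup, the two arguments would be the `libraries` and `library_dirs` lists printed above.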

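Oddly, this still failed because the library torch_python.lib could not be found. So I searched for it in my conda env and added its location to the library_dirs list. And it worked!
There may be some redundancy in the search paths, since I included both the libtorch directory (stored in my project) and the torch directory (from the PyTorch installation in my conda env). Here is the final working setup.py: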

from setuptools import setup, Extension
from torch.utils import cpp_extension

setup(name='raycast',
      ext_modules=
      [
          Extension(
              name="raycast", 
              sources=['extension.cpp'],
              
              include_dirs= 
                  cpp_extension.include_paths() + 
                  [
                        '../external/libtorch/include', 
                        "../external/libtorch/include/torch/csrc/api/include",
                        "../libray/include",
                        'C:\\Users\\wesle\\anaconda3\\lib\\site-packages\\torch\\include',
                        'C:\\Users\\wesle\\anaconda3\\lib\\site-packages\\torch\\include\\torch\\csrc\\api\\include', 
                        'C:\\Users\\wesle\\anaconda3\\lib\\site-packages\\torch\\include\\TH',
                        'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6\\include'
                  ],

              library_dirs=[
                  "../build/libray/include/Release/", 
                  "../external/libtorch/lib",
                  "C:\\Users\\wesle\\anaconda3\\Lib\\site-packages\\torch\\lib",
                  'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6\\lib/x64'],

              libraries=["libray", 'c10', 'torch', 'torch_cpu', 
                         'torch_python', 'cudart', 'c10_cuda', 
                         'torch_cuda_cu', 'torch_cuda_cpp']) 
      ],
      cmdclass={'build_ext': cpp_extension.BuildExtension})
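
Regarding the redundancy mentioned above: some entries in these lists name the same directory with a different spelling (`lib` vs `Lib`, `/` vs `\\`). This is a small sketch for dropping such repeats, assuming Windows path rules. It uses the standard-library `ntpath` module so it behaves the same on any platform, and the helper name `dedupe_windows_paths` is made up for illustration.

```python
import ntpath


def dedupe_windows_paths(paths):
    """Drop repeated directories, keeping the first spelling of each one."""
    seen = set()
    unique = []
    for path in paths:
        # normpath unifies separators; normcase lowercases, which matches the
        # usual case-insensitive comparison of Windows paths.
        key = ntpath.normcase(ntpath.normpath(path))
        if key not in seen:
            seen.add(key)
            unique.append(path)
    return unique


library_dirs = [
    "../external/libtorch/lib",
    "C:\\Users\\wesle\\anaconda3\\Lib\\site-packages\\torch\\lib",
    "C:\\Users\\wesle\\anaconda3\\lib\\site-packages\\torch\\lib",
    "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6\\lib/x64",
    "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6\\lib\\x64",
]
unique_dirs = dedupe_windows_paths(library_dirs)
print(len(library_dirs), "->", len(unique_dirs))
```

This only removes different spellings of the same directory. It does not decide between the project's libtorch copy and the conda env's torch directory, which are separate locations.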
