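This article collects code examples for the Java method jcuda.runtime.JCuda.cudaMalloc3DArray() and shows how JCuda.cudaMalloc3DArray() is used in practice. The examples come mainly from GitHub, Stack Overflow, and Maven, and were extracted from selected projects, so they should serve as a useful reference. The details of JCuda.cudaMalloc3DArray() are as follows:
Package path: jcuda.runtime.JCuda
Class name: JCuda
Method name: cudaMalloc3DArray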
Allocate an array on the device.
cudaError_t cudaMalloc3DArray(
    cudaArray_t* array,
    const cudaChannelFormatDesc* desc,
    cudaExtent extent,
    unsigned int flags = 0)
Allocates a CUDA array according to the cudaChannelFormatDesc structure desc and returns a handle to the new CUDA array in *array.
The cudaChannelFormatDesc is defined as:
struct cudaChannelFormatDesc {
    int x, y, z, w;
    enum cudaChannelFormatKind f;
};
where cudaChannelFormatKind is one of cudaChannelFormatKindSigned, cudaChannelFormatKindUnsigned, or cudaChannelFormatKindFloat.
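To make the descriptor layout concrete, here is a minimal plain-Java sketch that mirrors the struct. ChannelDescSketch is a hypothetical stand-in written for this illustration, not the JCuda cudaChannelFormatDesc class. It assumes that x, y, z, and w hold the number of bits per channel and that the kind constants follow the CUDA enum order (signed = 0, unsigned = 1, float = 2).

```java
// Hypothetical stand-in for cudaChannelFormatDesc; not part of JCuda.
final class ChannelDescSketch {
    // Assumed to mirror the CUDA enum cudaChannelFormatKind.
    static final int KIND_SIGNED = 0;
    static final int KIND_UNSIGNED = 1;
    static final int KIND_FLOAT = 2;

    final int x, y, z, w; // bits per channel; 0 means the channel is unused
    final int f;          // channel format kind

    ChannelDescSketch(int x, int y, int z, int w, int f) {
        if (x < 0 || y < 0 || z < 0 || w < 0) {
            throw new IllegalArgumentException("channel bit counts must be non-negative");
        }
        if (f < KIND_SIGNED || f > KIND_FLOAT) {
            throw new IllegalArgumentException("unknown channel format kind: " + f);
        }
        this.x = x;
        this.y = y;
        this.z = z;
        this.w = w;
        this.f = f;
    }

    // Number of channels that carry data.
    int channelCount() {
        int count = 0;
        for (int bits : new int[] {x, y, z, w}) {
            if (bits > 0) {
                count++;
            }
        }
        return count;
    }

    // Size of one array element in bytes (bit counts are expected to be multiples of 8).
    int bytesPerElement() {
        return (x + y + z + w) / 8;
    }

    public static void main(String[] args) {
        ChannelDescSketch float4 = new ChannelDescSketch(32, 32, 32, 32, KIND_FLOAT);
        System.out.println(float4.channelCount() + " channels, "
                + float4.bytesPerElement() + " bytes per element");
    }
}
```

In JCuda itself the descriptor is normally obtained from the library (for example through JCuda.cudaCreateChannelDesc) rather than built by hand; the sketch only illustrates what the five fields mean.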
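cudaMalloc3DArray() can allocate the following:
* A 1D array is allocated if the height and depth extents are both zero.
* A 2D array is allocated if only the depth extent is zero.
* A 3D array is allocated if all three extents are non-zero.
* A 1D layered CUDA array is allocated if only the height extent is zero and the cudaArrayLayered flag is set. Each layer is a 1D array. The number of layers is determined by the depth extent.
* A 2D layered CUDA array is allocated if all three extents are non-zero and the cudaArrayLayered flag is set. Each layer is a 2D array. The number of layers is determined by the depth extent.
* A cubemap CUDA array is allocated if all three extents are non-zero and the cudaArrayCubemap flag is set. Width must be equal to height, and depth must be six. A cubemap is a special type of 2D layered CUDA array in which the six layers represent the six faces of a cube. The order of the six layers in memory is the same as that listed in cudaGraphicsCubeFace.
* A cubemap layered CUDA array is allocated if all three extents are non-zero and both the cudaArrayCubemap and cudaArrayLayered flags are set. Width must be equal to height, and depth must be a multiple of six. A cubemap layered CUDA array is a special type of 2D layered CUDA array that consists of a collection of cubemaps. The first six layers represent the first cubemap, the next six layers form the second cubemap, and so on.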
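The way the extent and the flags together select the array type can be sketched in plain Java. ArrayKindSketch and its classify method are hypothetical names for this illustration and do not exist in JCuda. The flag values are assumed to mirror the CUDA runtime constants (cudaArrayLayered = 0x01, cudaArrayCubemap = 0x04), and the rejection of inconsistent combinations is this sketch's own choice, not documented driver behavior.

```java
// Hypothetical helper; it only models the documented selection rules.
final class ArrayKindSketch {
    // Assumed to mirror the CUDA runtime flag values.
    static final int CUDA_ARRAY_DEFAULT = 0x00;
    static final int CUDA_ARRAY_LAYERED = 0x01;
    static final int CUDA_ARRAY_CUBEMAP = 0x04;

    static String classify(long width, long height, long depth, int flags) {
        if (width <= 0 || height < 0 || depth < 0) {
            throw new IllegalArgumentException("width must be positive; height and depth must not be negative");
        }
        boolean layered = (flags & CUDA_ARRAY_LAYERED) != 0;
        boolean cubemap = (flags & CUDA_ARRAY_CUBEMAP) != 0;

        if (cubemap) {
            // Cubemaps need square faces and all three extents non-zero.
            if (height == 0 || depth == 0 || width != height) {
                throw new IllegalArgumentException("cubemap needs width == height and a non-zero depth");
            }
            if (layered) {
                if (depth % 6 != 0) {
                    throw new IllegalArgumentException("cubemap layered depth must be a multiple of six");
                }
                return "cubemap layered";
            }
            if (depth != 6) {
                throw new IllegalArgumentException("cubemap depth must be six");
            }
            return "cubemap";
        }
        if (layered) {
            // The depth extent carries the number of layers.
            if (depth == 0) {
                throw new IllegalArgumentException("layered arrays need a non-zero depth");
            }
            return height == 0 ? "1D layered" : "2D layered";
        }
        if (height == 0 && depth == 0) {
            return "1D";
        }
        if (depth == 0) {
            return "2D";
        }
        if (height == 0) {
            throw new IllegalArgumentException("zero height with non-zero depth requires cudaArrayLayered");
        }
        return "3D";
    }

    public static void main(String[] args) {
        System.out.println(classify(64, 64, 12, CUDA_ARRAY_CUBEMAP | CUDA_ARRAY_LAYERED));
    }
}
```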
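The flags parameter enables different options to be specified that affect the allocation, as follows.
* cudaArrayDefault: This flag's value is defined to be 0 and provides default array allocation.
* cudaArrayLayered: Allocates a layered CUDA array, with the depth extent indicating the number of layers.
* cudaArrayCubemap: Allocates a cubemap CUDA array. Width must be equal to height, and depth must be six. If the cudaArrayLayered flag is also set, depth must be a multiple of six.
* cudaArraySurfaceLoadStore: Allocates a CUDA array that can be read from or written to using a surface reference.
* cudaArrayTextureGather: Indicates that texture gather operations will be performed on the CUDA array. Texture gather can only be performed on 2D CUDA arrays.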
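Because flags is a bit mask, several options can be combined with a bitwise OR. The following plain-Java sketch decodes such a mask into flag names. ArrayFlagsSketch is a hypothetical helper, and the numeric values are assumed to mirror the CUDA runtime constants for the five flags discussed here; newer CUDA versions define further flags that this sketch treats as unknown.

```java
// Hypothetical helper for reading a cudaMalloc3DArray flags bit mask.
final class ArrayFlagsSketch {
    // Assumed to mirror the CUDA runtime flag values.
    static final int CUDA_ARRAY_DEFAULT = 0x00;
    static final int CUDA_ARRAY_LAYERED = 0x01;
    static final int CUDA_ARRAY_SURFACE_LOAD_STORE = 0x02;
    static final int CUDA_ARRAY_CUBEMAP = 0x04;
    static final int CUDA_ARRAY_TEXTURE_GATHER = 0x08;

    private static final int KNOWN_BITS = CUDA_ARRAY_LAYERED | CUDA_ARRAY_SURFACE_LOAD_STORE
            | CUDA_ARRAY_CUBEMAP | CUDA_ARRAY_TEXTURE_GATHER;

    // Returns the flag names joined by " | ", in ascending bit order.
    static String describe(int flags) {
        if ((flags & ~KNOWN_BITS) != 0) {
            throw new IllegalArgumentException("flag bits not covered by this sketch: " + flags);
        }
        if (flags == CUDA_ARRAY_DEFAULT) {
            return "cudaArrayDefault";
        }
        StringBuilder names = new StringBuilder();
        append(names, flags, CUDA_ARRAY_LAYERED, "cudaArrayLayered");
        append(names, flags, CUDA_ARRAY_SURFACE_LOAD_STORE, "cudaArraySurfaceLoadStore");
        append(names, flags, CUDA_ARRAY_CUBEMAP, "cudaArrayCubemap");
        append(names, flags, CUDA_ARRAY_TEXTURE_GATHER, "cudaArrayTextureGather");
        return names.toString();
    }

    private static void append(StringBuilder names, int flags, int bit, String name) {
        if ((flags & bit) == 0) {
            return;
        }
        if (names.length() > 0) {
            names.append(" | ");
        }
        names.append(name);
    }

    public static void main(String[] args) {
        System.out.println(describe(CUDA_ARRAY_LAYERED | CUDA_ARRAY_CUBEMAP));
    }
}
```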
The width, height and depth extents must meet certain size requirements as listed in the following table. All values are specified in elements.
Note that 2D CUDA arrays have different size requirements if the cudaArrayTextureGather flag is set. In that case, the valid range for (width, height, depth) is ((1,maxTexture2DGather[0]), (1,maxTexture2DGather[1]), 0).
CUDA array type | Valid extents that must always be met {(width range in elements), (height range), (depth range)} | Valid extents with cudaArraySurfaceLoadStore set {(width range in elements), (height range), (depth range)}
1D | { (1,maxTexture1D), 0, 0 } | { (1,maxSurface1D), 0, 0 }
2D | { (1,maxTexture2D[0]), (1,maxTexture2D[1]), 0 } | { (1,maxSurface2D[0]), (1,maxSurface2D[1]), 0 }
3D | { (1,maxTexture3D[0]), (1,maxTexture3D[1]), (1,maxTexture3D[2]) } | { (1,maxSurface3D[0]), (1,maxSurface3D[1]), (1,maxSurface3D[2]) }
1D Layered | { (1,maxTexture1DLayered[0]), 0, (1,maxTexture1DLayered[1]) } | { (1,maxSurface1DLayered[0]), 0, (1,maxSurface1DLayered[1]) }
2D Layered | { (1,maxTexture2DLayered[0]), (1,maxTexture2DLayered[1]), (1,maxTexture2DLayered[2]) } | { (1,maxSurface2DLayered[0]), (1,maxSurface2DLayered[1]), (1,maxSurface2DLayered[2]) }
Cubemap | { (1,maxTextureCubemap), (1,maxTextureCubemap), 6 } | { (1,maxSurfaceCubemap), (1,maxSurfaceCubemap), 6 }
Cubemap Layered | { (1,maxTextureCubemapLayered[0]), (1,maxTextureCubemapLayered[0]), (1,maxTextureCubemapLayered[1]) } | { (1,maxSurfaceCubemapLayered[0]), (1,maxSurfaceCubemapLayered[0]), (1,maxSurfaceCubemapLayered[1]) }
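A range check of this kind can be sketched in plain Java for the plain 1D, 2D, and 3D rows. ExtentCheckSketch is a hypothetical helper, and the limit values in main are made-up numbers for illustration only; on a real system the limits would come from the device properties (the maxTexture* and maxSurface* fields). Layered and cubemap rows follow different patterns and are not covered by this helper.

```java
// Hypothetical helper that checks an extent against per-dimension limits.
final class ExtentCheckSketch {
    /**
     * extent is {width, height, depth} in elements. limits holds one maximum
     * per dimension the array type uses (1 entry for 1D, 2 for 2D, 3 for 3D).
     * Used dimensions must lie in [1, limit]; unused dimensions must be 0.
     */
    static boolean within(long[] extent, int[] limits) {
        for (int i = 0; i < extent.length; i++) {
            if (i < limits.length) {
                if (extent[i] < 1 || extent[i] > limits[i]) {
                    return false;
                }
            } else if (extent[i] != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Illustrative limits only; real values depend on the device.
        int[] hypotheticalMaxTexture2D = {2048, 1024};
        System.out.println(within(new long[] {1024, 512, 0}, hypotheticalMaxTexture2D));
    }
}
```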
Note: this function may also return error codes from previous, asynchronous launches.
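Since the call reports failure through its return value, callers usually check the status after every call. Below is a minimal plain-Java sketch of such a check. CudaStatusSketch is a hypothetical helper, and it assumes that the success code is 0, as with cudaSuccess in the CUDA runtime. A non-zero status seen here does not necessarily originate in the allocation itself, because it may stem from an earlier asynchronous launch.

```java
// Hypothetical helper that turns a CUDA status code into an exception.
final class CudaStatusSketch {
    static final int CUDA_SUCCESS = 0; // assumed to match cudaSuccess

    // Throws if the status returned by a runtime call signals an error.
    static void check(int status, String call) {
        if (status != CUDA_SUCCESS) {
            throw new IllegalStateException(
                    call + " failed with CUDA error code " + status
                    + " (possibly raised by an earlier asynchronous launch)");
        }
    }

    public static void main(String[] args) {
        int status = 0; // stands in for the value returned by cudaMalloc3DArray(...)
        check(status, "cudaMalloc3DArray");
        System.out.println("allocation status OK");
    }
}
```

JCuda can also be asked to throw on errors by itself through JCuda.setExceptionsEnabled(true), which makes a manual check like this unnecessary.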
Code example source: org.jcuda/jcuda
// Passes flags = 0 (cudaArrayDefault) to request a default allocation.
return cudaMalloc3DArray(arrayPtr, desc, extent, 0);
Code example source: org.nd4j/jcuda-windows64
return cudaMalloc3DArray(arrayPtr, desc, extent, 0);
Code example source: org.nd4j/jcuda
return cudaMalloc3DArray(arrayPtr, desc, extent, 0);
Code example source: org.nd4j/nd4j-jcublas-common
return cudaMalloc3DArray(arrayPtr, desc, extent, 0);