Usage and code examples of the jcuda.runtime.JCuda.cudaMalloc3DArray() method

Reposted by x33g5p2x on 2022-01-22 in Other

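This article collects code examples of the jcuda.runtime.JCuda.cudaMalloc3DArray() method in Java and shows how JCuda.cudaMalloc3DArray() is used. The examples come mainly from platforms such as GitHub, Stack Overflow, and Maven, and were extracted from selected projects, so they should serve as a practical reference. Details of the JCuda.cudaMalloc3DArray() method are as follows:
Package path: jcuda.runtime.JCuda
Class name: JCuda
Method name: cudaMalloc3DArray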

Introduction to JCuda.cudaMalloc3DArray

Allocate an array on the device.

cudaError_t cudaMalloc3DArray(
    cudaArray_t* array,
    const cudaChannelFormatDesc* desc,
    cudaExtent extent,
    unsigned int flags = 0)

Allocates a CUDA array according to the cudaChannelFormatDesc structure desc and returns a handle to the new CUDA array in *array.

The cudaChannelFormatDesc is defined as:

struct cudaChannelFormatDesc {
    int x, y, z, w;
    enum cudaChannelFormatKind f;
};

where cudaChannelFormatKind is one of cudaChannelFormatKindSigned, cudaChannelFormatKindUnsigned, or cudaChannelFormatKindFloat.
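
To make the role of the descriptor fields more concrete, here is a minimal sketch that models the struct in plain Java and derives the size of one array element. It relies on the CUDA convention that x, y, z and w hold the number of bits in each channel, which the text above does not spell out. The class ChannelDesc, its enum Kind, and the method bytesPerElement are hypothetical names for this illustration only; real JCuda code would use the library's own descriptor class instead.

```java
// Hypothetical stand-in for the C struct above; this is NOT a JCuda class.
final class ChannelDesc {
    // Mirrors cudaChannelFormatKindSigned / Unsigned / Float.
    enum Kind { SIGNED, UNSIGNED, FLOAT }

    final int x, y, z, w; // bits per channel; 0 means the channel is unused
    final Kind f;

    ChannelDesc(int x, int y, int z, int w, Kind f) {
        if (x < 0 || y < 0 || z < 0 || w < 0) {
            throw new IllegalArgumentException("bit counts must be non-negative");
        }
        this.x = x;
        this.y = y;
        this.z = z;
        this.w = w;
        this.f = f;
    }

    // Total number of bits in one element (all channels together).
    int bitsPerElement() {
        return x + y + z + w;
    }

    // Element size in bytes; assumes the total is a multiple of 8.
    int bytesPerElement() {
        return bitsPerElement() / 8;
    }

    public static void main(String[] args) {
        ChannelDesc float4 = new ChannelDesc(32, 32, 32, 32, Kind.FLOAT);
        ChannelDesc uchar1 = new ChannelDesc(8, 0, 0, 0, Kind.UNSIGNED);
        System.out.println(float4.bytesPerElement()); // prints 16
        System.out.println(uchar1.bytesPerElement()); // prints 1
    }
}
```

A four-channel float descriptor therefore describes 16-byte elements, while a single 8-bit unsigned channel describes 1-byte elements.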

cudaMalloc3DArray() can allocate the following:

  • A 1D array is allocated if the height and depth extents are both zero.
  • A 2D array is allocated if only the depth extent is zero.
  • A 3D array is allocated if all three extents are non-zero.
  • A 1D layered CUDA array is allocated if only the height extent is zero and the cudaArrayLayered flag is set. Each layer is a 1D array. The number of layers is determined by the depth extent.
  • A 2D layered CUDA array is allocated if all three extents are non-zero and the cudaArrayLayered flag is set. Each layer is a 2D array. The number of layers is determined by the depth extent.
  • A cubemap CUDA array is allocated if all three extents are non-zero and the cudaArrayCubemap flag is set. Width must be equal to height, and depth must be six. A cubemap is a special type of 2D layered CUDA array, where the six layers represent the six faces of a cube. The order of the six layers in memory is the same as that listed in cudaGraphicsCubeFace.
  • A cubemap layered CUDA array is allocated if all three extents are non-zero and both the cudaArrayCubemap and cudaArrayLayered flags are set. Width must be equal to height, and depth must be a multiple of six. A cubemap layered CUDA array is a special type of 2D layered CUDA array that consists of a collection of cubemaps. The first six layers represent the first cubemap, the next six layers form the second cubemap, and so on.
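
The rules in the list above can be sketched as a small decision function. This is only an illustration of the documented rules in plain Java, not what the CUDA runtime actually executes. The class ArrayKindClassifier and the method classify are hypothetical names, and the two flag values are declared locally (assumed to match CUDA's driver_types.h) so that the sketch has no JCuda dependency.

```java
// Hypothetical helper illustrating the extent/flag rules; not part of JCuda.
final class ArrayKindClassifier {
    // Assumed to match cudaArrayLayered and cudaArrayCubemap in driver_types.h.
    static final int LAYERED = 0x01;
    static final int CUBEMAP = 0x04;

    static String classify(long width, long height, long depth, int flags) {
        if (width <= 0 || height < 0 || depth < 0) {
            throw new IllegalArgumentException("width must be positive, height and depth non-negative");
        }
        boolean layered = (flags & LAYERED) != 0;
        boolean cubemap = (flags & CUBEMAP) != 0;

        if (cubemap) {
            if (height == 0 || depth == 0) {
                throw new IllegalArgumentException("cubemaps need all three extents non-zero");
            }
            if (width != height) {
                throw new IllegalArgumentException("cubemap width must equal height");
            }
            if (layered) {
                if (depth % 6 != 0) {
                    throw new IllegalArgumentException("cubemap layered depth must be a multiple of six");
                }
                return "cubemap layered";
            }
            if (depth != 6) {
                throw new IllegalArgumentException("cubemap depth must be six");
            }
            return "cubemap";
        }
        if (layered) {
            if (depth == 0) {
                throw new IllegalArgumentException("layered arrays use depth as the layer count");
            }
            return height == 0 ? "1D layered" : "2D layered";
        }
        if (height == 0 && depth == 0) {
            return "1D";
        }
        if (depth == 0) {
            return "2D";
        }
        if (height != 0) {
            return "3D";
        }
        throw new IllegalArgumentException("zero height with non-zero depth requires the layered flag");
    }

    public static void main(String[] args) {
        System.out.println(classify(64, 0, 0, 0));                  // prints 1D
        System.out.println(classify(64, 32, 4, LAYERED));           // prints 2D layered
        System.out.println(classify(32, 32, 12, CUBEMAP | LAYERED)); // prints cubemap layered
    }
}
```

Rejecting invalid combinations with an exception is a choice made for this sketch; the real function reports problems through its cudaError_t return value.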

The flags parameter specifies options that affect the allocation, as follows.

  • cudaArrayDefault: This flag's value is defined to be 0 and provides default array allocation.
  • cudaArrayLayered: Allocates a layered CUDA array, with the depth extent indicating the number of layers
  • cudaArrayCubemap: Allocates a cubemap CUDA array. Width must be equal to height, and depth must be six. If the cudaArrayLayered flag is also set, depth must be a multiple of six.
  • cudaArraySurfaceLoadStore: Allocates a CUDA array that could be read from or written to using a surface reference.
  • cudaArrayTextureGather: This flag indicates that texture gather operations will be performed on the CUDA array. Texture gather can only be performed on 2D CUDA arrays.
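
Because flags is an unsigned int, the options listed above are combined as a bitmask. The following sketch shows one way to compose and inspect such a mask in plain Java. The numeric values are taken from CUDA's driver_types.h rather than from the text above (which only states that cudaArrayDefault is 0), so treat them as assumptions and prefer the named constants that JCuda provides in real code. The class ArrayFlags and its methods are hypothetical.

```java
import java.util.StringJoiner;

// Hypothetical helper for composing allocation flags; not part of JCuda.
final class ArrayFlags {
    // Assumed values, mirroring CUDA's driver_types.h.
    static final int DEFAULT = 0x00;
    static final int LAYERED = 0x01;
    static final int SURFACE_LOAD_STORE = 0x02;
    static final int CUBEMAP = 0x04;
    static final int TEXTURE_GATHER = 0x08;

    // True if the given single-bit flag is set in the mask.
    static boolean has(int flags, int flag) {
        return (flags & flag) != 0;
    }

    // Human-readable list of the flags set in the mask.
    static String describe(int flags) {
        if (flags == DEFAULT) {
            return "cudaArrayDefault";
        }
        StringJoiner names = new StringJoiner(" | ");
        if (has(flags, LAYERED)) names.add("cudaArrayLayered");
        if (has(flags, SURFACE_LOAD_STORE)) names.add("cudaArraySurfaceLoadStore");
        if (has(flags, CUBEMAP)) names.add("cudaArrayCubemap");
        if (has(flags, TEXTURE_GATHER)) names.add("cudaArrayTextureGather");
        return names.toString();
    }

    // Texture gather is documented as valid only for 2D arrays (only depth is zero).
    static boolean gatherExtentsOk(long width, long height, long depth, int flags) {
        if (!has(flags, TEXTURE_GATHER)) {
            return true; // the rule only applies when the flag is set
        }
        return width > 0 && height > 0 && depth == 0;
    }

    public static void main(String[] args) {
        int flags = LAYERED | CUBEMAP;
        System.out.println(describe(flags));   // prints cudaArrayLayered | cudaArrayCubemap
        System.out.println(describe(DEFAULT)); // prints cudaArrayDefault
        System.out.println(gatherExtentsOk(256, 256, 8, TEXTURE_GATHER)); // prints false
    }
}
```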

The width, height and depth extents must meet certain size requirements as listed in the following table. All values are specified in elements.

Note that 2D CUDA arrays have different size requirements if the cudaArrayTextureGather flag is set. In that case, the valid range for (width, height, depth) is ((1,maxTexture2DGather[0]), (1,maxTexture2DGather[1]), 0).
Valid extents per CUDA array type, given as {(width range in elements), (height range), (depth range)}:

1D
  Must always be met:              { (1,maxTexture1D), 0, 0 }
  With cudaArraySurfaceLoadStore:  { (1,maxSurface1D), 0, 0 }

2D
  Must always be met:              { (1,maxTexture2D[0]), (1,maxTexture2D[1]), 0 }
  With cudaArraySurfaceLoadStore:  { (1,maxSurface2D[0]), (1,maxSurface2D[1]), 0 }

3D
  Must always be met:              { (1,maxTexture3D[0]), (1,maxTexture3D[1]), (1,maxTexture3D[2]) }
  With cudaArraySurfaceLoadStore:  { (1,maxSurface3D[0]), (1,maxSurface3D[1]), (1,maxSurface3D[2]) }

1D Layered
  Must always be met:              { (1,maxTexture1DLayered[0]), 0, (1,maxTexture1DLayered[1]) }
  With cudaArraySurfaceLoadStore:  { (1,maxSurface1DLayered[0]), 0, (1,maxSurface1DLayered[1]) }

2D Layered
  Must always be met:              { (1,maxTexture2DLayered[0]), (1,maxTexture2DLayered[1]), (1,maxTexture2DLayered[2]) }
  With cudaArraySurfaceLoadStore:  { (1,maxSurface2DLayered[0]), (1,maxSurface2DLayered[1]), (1,maxSurface2DLayered[2]) }

Cubemap
  Must always be met:              { (1,maxTextureCubemap), (1,maxTextureCubemap), 6 }
  With cudaArraySurfaceLoadStore:  { (1,maxSurfaceCubemap), (1,maxSurfaceCubemap), 6 }

Cubemap Layered
  Must always be met:              { (1,maxTextureCubemapLayered[0]), (1,maxTextureCubemapLayered[0]), (1,maxTextureCubemapLayered[1]) }
  With cudaArraySurfaceLoadStore:  { (1,maxSurfaceCubemapLayered[0]), (1,maxSurfaceCubemapLayered[0]), (1,maxSurfaceCubemapLayered[1]) }
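
Each entry in the table above is either a fixed value (0 or 6) or an inclusive range starting at 1, so an extent check can be sketched as three bound tests. The example below is a plain Java illustration: the class ExtentCheck and its nested Bound type are hypothetical, and the numeric limits are made-up placeholders. On a real system the limits are device properties (fields such as maxTexture2D in cudaDeviceProp) and should be queried from the device rather than hard-coded.

```java
// Hypothetical helper for checking extents against table entries; not part of JCuda.
final class ExtentCheck {
    // One table cell: a fixed value such as 0 or 6, or an inclusive range 1..max.
    static final class Bound {
        final long min;
        final long max;

        private Bound(long min, long max) {
            this.min = min;
            this.max = max;
        }

        static Bound range(long max) {
            return new Bound(1, max);
        }

        static Bound exactly(long value) {
            return new Bound(value, value);
        }

        boolean accepts(long value) {
            return value >= min && value <= max;
        }
    }

    // True if width, height and depth each satisfy their bound.
    static boolean valid(long width, long height, long depth,
                         Bound widthBound, Bound heightBound, Bound depthBound) {
        return widthBound.accepts(width)
            && heightBound.accepts(height)
            && depthBound.accepts(depth);
    }

    public static void main(String[] args) {
        // Placeholder limits for illustration only; query the device for real values.
        long[] maxTexture2D = {65536, 65536};
        long maxTextureCubemap = 16384;

        // 2D row: { (1,maxTexture2D[0]), (1,maxTexture2D[1]), 0 }
        System.out.println(valid(1024, 768, 0,
            Bound.range(maxTexture2D[0]), Bound.range(maxTexture2D[1]), Bound.exactly(0))); // prints true

        // Cubemap row: { (1,maxTextureCubemap), (1,maxTextureCubemap), 6 }
        System.out.println(valid(512, 512, 5,
            Bound.range(maxTextureCubemap), Bound.range(maxTextureCubemap), Bound.exactly(6))); // prints false
    }
}
```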

Note: this function may also return error codes from previous, asynchronous launches.

Code examples

Code example source: origin: org.jcuda/jcuda (the identical line also appears in org.nd4j/jcuda-windows64, org.nd4j/jcuda, and org.nd4j/nd4j-jcublas-common)

// The last argument is flags; 0 corresponds to cudaArrayDefault.
return cudaMalloc3DArray(arrayPtr, desc, extent, 0);
