CUDA Driver API :: CUDA Toolkit Documentation

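6.16. Multicast Object Management

This section describes the CUDA multicast object operations exposed by the low-level CUDA driver application programming interface.

Overview

A multicast object created via cuMulticastCreate enables certain memory operations to be broadcast to a team of devices. Devices can be added to a multicast object via cuMulticastAddDevice. Memory can be bound on each participating device via either cuMulticastBindMem or cuMulticastBindAddr. Multicast objects can be mapped into a device's virtual address space using the virtual memory management APIs (see cuMemMap and cuMemSetAccess).

Supported Platforms

Support for multicast on a specific device can be queried using the device attribute CU_DEVICE_ATTRIBUTE_MULTICAST_SUPPORTED.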

Functions

CUresult cuMulticastAddDevice ( CUmemGenericAllocationHandle mcHandle, CUdevice dev ): Associate a device to a multicast object.
CUresult cuMulticastBindAddr ( CUmemGenericAllocationHandle mcHandle, size_t mcOffset, CUdeviceptr memptr, size_t size, unsigned long long flags ): Bind a memory allocation represented by a virtual address to a multicast object.
CUresult cuMulticastBindMem ( CUmemGenericAllocationHandle mcHandle, size_t mcOffset, CUmemGenericAllocationHandle memHandle, size_t memOffset, size_t size, unsigned long long flags ): Bind a memory allocation represented by a handle to a multicast object.
CUresult cuMulticastCreate ( CUmemGenericAllocationHandle* mcHandle, const CUmulticastObjectProp* prop ): Create a generic allocation handle representing a multicast object described by the given properties.
CUresult cuMulticastGetGranularity ( size_t* granularity, const CUmulticastObjectProp* prop, CUmulticastGranularity_flags option ): Calculates either the minimal or recommended granularity for a multicast object.
CUresult cuMulticastUnbind ( CUmemGenericAllocationHandle mcHandle, CUdevice dev, size_t mcOffset, size_t size ): Unbind any memory allocations bound to a multicast object at a given offset and up to a given size.

Functions

CUresult cuMulticastAddDevice ( CUmemGenericAllocationHandle mcHandle, CUdevice dev )

Associate a device to a multicast object.

Parameters

mcHandle: Handle representing a multicast object.
dev: Device that will be associated to the multicast object.

Returns

CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_OUT_OF_MEMORY, CUDA_ERROR_INVALID_DEVICE, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_PERMITTED, CUDA_ERROR_NOT_SUPPORTED

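Description

Associates a device to a multicast object. The added device becomes part of the multicast team whose size was specified by CUmulticastObjectProp::numDevices during cuMulticastCreate. The association of the device to the multicast object is permanent for the lifetime of the multicast object. All devices must be added to the multicast team before any memory can be bound to any device in the team. Any call to cuMulticastBindMem or cuMulticastBindAddr will block until all devices have been added. Similarly, all devices must be added to the multicast team before a virtual address range can be mapped to the multicast object. A call to cuMemMap will block until all devices have been added.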
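Because bind and map calls block until the team is complete, a caller may want to track on the host side how many devices have been added so far. The following is a minimal sketch of such bookkeeping in plain C. The type mc_team_tracker and its helper functions are hypothetical names introduced here for illustration; they are not part of the CUDA driver API, and the sketch makes no driver calls itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side record of a multicast team; not a CUDA type. */
typedef struct {
    unsigned int num_devices; /* mirrors CUmulticastObjectProp::numDevices */
    unsigned int added;       /* successful cuMulticastAddDevice calls so far */
} mc_team_tracker;

/* Start tracking a team of the given size with no devices added yet. */
static mc_team_tracker mc_team_init(unsigned int num_devices)
{
    mc_team_tracker t = { num_devices, 0u };
    return t;
}

/* Record one successful add; returns false if the team is already full. */
static bool mc_team_note_added(mc_team_tracker *t)
{
    assert(t != NULL);
    if (t->added >= t->num_devices) {
        return false;
    }
    t->added++;
    return true;
}

/* True once every expected device has been recorded as added. */
static bool mc_team_complete(const mc_team_tracker *t)
{
    assert(t != NULL);
    return t->added == t->num_devices;
}
```

A caller would invoke mc_team_note_added after each cuMulticastAddDevice call that returns CUDA_SUCCESS, and consult mc_team_complete before issuing a bind or map call that would otherwise block.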

See also:

cuMulticastCreate, cuMulticastBindMem, cuMulticastBindAddr

CUresult cuMulticastBindAddr ( CUmemGenericAllocationHandle mcHandle, size_t mcOffset, CUdeviceptr memptr, size_t size, unsigned long long flags )

Bind a memory allocation represented by a virtual address to a multicast object.

Parameters

mcHandle: Handle representing a multicast object.
mcOffset: Offset into multicast va range for attachment.
memptr: Virtual address of the memory allocation.
size: Size of memory that will be bound to the multicast object.
flags: Flags for future use, must be zero now.

Returns

CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_INVALID_DEVICE, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_PERMITTED, CUDA_ERROR_NOT_SUPPORTED, CUDA_ERROR_OUT_OF_MEMORY, CUDA_ERROR_SYSTEM_NOT_READY

Description

Binds a memory allocation specified by its mapped address memptr to a multicast object represented by mcHandle. The memory must have been allocated via cuMemCreate or cudaMallocAsync. The intended size of the bind, the offset in the multicast range mcOffset, and memptr must each be a multiple of the value returned by cuMulticastGetGranularity with the flag CU_MULTICAST_GRANULARITY_MINIMUM. For best performance, however, size, mcOffset, and memptr should be aligned to the value returned by cuMulticastGetGranularity with the flag CU_MULTICAST_GRANULARITY_RECOMMENDED.

The size cannot be larger than the size of the allocated memory. Similarly, size + mcOffset cannot be larger than the total size of the multicast object. The memory allocation must have been created on one of the devices that was added to the multicast team via cuMulticastAddDevice. Externally shareable as well as imported multicast objects can be bound only to externally shareable memory. Note that this call will return CUDA_ERROR_OUT_OF_MEMORY if there are insufficient resources required to perform the bind. This call may also return CUDA_ERROR_SYSTEM_NOT_READY if the necessary system software is not initialized or running.

See also:

cuMulticastCreate, cuMulticastAddDevice, cuMemCreate

CUresult cuMulticastBindMem ( CUmemGenericAllocationHandle mcHandle, size_t mcOffset, CUmemGenericAllocationHandle memHandle, size_t memOffset, size_t size, unsigned long long flags )

Bind a memory allocation represented by a handle to a multicast object.

Parameters

mcHandle: Handle representing a multicast object.
mcOffset: Offset into the multicast object for attachment.
memHandle: Handle representing a memory allocation.
memOffset: Offset into the memory for attachment.
size: Size of the memory that will be bound to the multicast object.
flags: Flags for future use, must be zero for now.

Description

Binds a memory allocation specified by memHandle and created via cuMemCreate to a multicast object represented by mcHandle and created via cuMulticastCreate. The intended size of the bind, the offset in the multicast range mcOffset, and the offset in the memory memOffset must each be a multiple of the value returned by cuMulticastGetGranularity with the flag CU_MULTICAST_GRANULARITY_MINIMUM. For best performance, however, size, mcOffset, and memOffset should be aligned to the granularity of the memory allocation (see cuMemGetAllocationGranularity) or to the value returned by cuMulticastGetGranularity with the flag CU_MULTICAST_GRANULARITY_RECOMMENDED.

The size + memOffset cannot be larger than the size of the allocated memory. Similarly, size + mcOffset cannot be larger than the size of the multicast object. The memory allocation must have been created on one of the devices that was added to the multicast team via cuMulticastAddDevice. Externally shareable as well as imported multicast objects can be bound only to externally shareable memory. Note that this call will return CUDA_ERROR_OUT_OF_MEMORY if there are insufficient resources required to perform the bind. This call may also return CUDA_ERROR_SYSTEM_NOT_READY if the necessary system software is not initialized or running.
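The alignment and range constraints above can be checked on the host before calling cuMulticastBindMem. The sketch below is plain C and makes no driver calls. The function names, and the idea of passing in the multicast object size, the allocation size, and the minimum granularity as plain size_t values, are assumptions made for illustration; in a real program the granularity would come from cuMulticastGetGranularity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if value is an exact multiple of granularity. */
static bool is_multiple_of(size_t value, size_t granularity)
{
    assert(granularity != 0);
    return value % granularity == 0;
}

/* True if offset + size <= total, written so the sum cannot overflow. */
static bool range_fits(size_t offset, size_t size, size_t total)
{
    return offset <= total && size <= total - offset;
}

/* Hypothetical pre-check mirroring the documented bind constraints:
 * all three values are multiples of the minimum granularity, and the
 * bound range fits in both the allocation and the multicast object. */
static bool bind_mem_args_ok(size_t mc_offset, size_t mem_offset, size_t size,
                             size_t mc_size, size_t mem_size,
                             size_t min_granularity)
{
    return is_multiple_of(mc_offset, min_granularity)
        && is_multiple_of(mem_offset, min_granularity)
        && is_multiple_of(size, min_granularity)
        && range_fits(mem_offset, size, mem_size)
        && range_fits(mc_offset, size, mc_size);
}
```

The same check applies to cuMulticastBindAddr if memptr is tested for alignment in place of memOffset. Passing this check does not guarantee the bind succeeds, since the driver can still report conditions such as CUDA_ERROR_OUT_OF_MEMORY.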

See also:

cuMulticastCreate, cuMulticastAddDevice, cuMemCreate

CUresult cuMulticastCreate ( CUmemGenericAllocationHandle* mcHandle, const CUmulticastObjectProp* prop )

Create a generic allocation handle representing a multicast object described by the given properties.

Parameters

mcHandle: Value of handle returned.
prop: Properties of the multicast object to create.

Returns

CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_OUT_OF_MEMORY, CUDA_ERROR_INVALID_DEVICE, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_PERMITTED, CUDA_ERROR_NOT_SUPPORTED

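Description

This creates a multicast object as described by prop. The number of participating devices is specified by CUmulticastObjectProp::numDevices. Devices can be added to the multicast object via cuMulticastAddDevice. All participating devices must be added to the multicast object before memory can be bound to it. Memory is bound to the multicast object via either cuMulticastBindMem or cuMulticastBindAddr, and can be unbound via cuMulticastUnbind. The total amount of memory that can be bound per device is specified by CUmulticastObjectProp::size. This size must be a multiple of the value returned by cuMulticastGetGranularity with the flag CU_MULTICAST_GRANULARITY_MINIMUM. For best performance, however, the size should be aligned to the value returned by cuMulticastGetGranularity with the flag CU_MULTICAST_GRANULARITY_RECOMMENDED.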
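Since CUmulticastObjectProp::size must be a multiple of the granularity, a requested size usually has to be rounded up first. The helper below is a minimal sketch in plain C; round_up_to_granularity is a hypothetical name, and the granularity argument is assumed to be a value previously obtained from cuMulticastGetGranularity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round size up to the next multiple of granularity.
 * Works for any non-zero granularity, not only powers of two. */
static size_t round_up_to_granularity(size_t size, size_t granularity)
{
    assert(granularity != 0);
    size_t remainder = size % granularity;
    if (remainder == 0) {
        return size; /* already a multiple */
    }
    size_t padding = granularity - remainder;
    assert(size <= SIZE_MAX - padding); /* guard against overflow */
    return size + padding;
}
```

The rounded value would then be stored in CUmulticastObjectProp::size before calling cuMulticastCreate. Rounding to the recommended granularity rather than the minimum follows the performance advice above, at the cost of a possibly larger object.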

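After all participating devices have been added, multicast objects can also be mapped to a device's virtual address space using the virtual memory management APIs (see cuMemMap and cuMemSetAccess). Multicast objects can also be shared with other processes by requesting a shareable handle via cuMemExportToShareableHandle. Note that the desired types of shareable handles must be specified in the bitmask CUmulticastObjectProp::handleTypes. Multicast objects can be released using the virtual memory management API cuMemRelease.

See also: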

cuMulticastAddDevice, cuMulticastBindMem, cuMulticastBindAddr, cuMulticastUnbind

cuMemCreate, cuMemRelease, cuMemExportToShareableHandle, cuMemImportFromShareableHandle

CUresult cuMulticastGetGranularity ( size_t* granularity, const CUmulticastObjectProp* prop, CUmulticastGranularity_flags option )

Calculates either the minimal or recommended granularity for a multicast object.

Parameters

granularity: Returned granularity.
prop: Properties of the multicast object.
option: Determines which granularity to return.

Returns

CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_PERMITTED, CUDA_ERROR_NOT_SUPPORTED

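Description

Calculates either the minimal or recommended granularity for the given set of multicast object properties and returns it in granularity. This granularity can be used as a multiple for the size of a multicast object, bind offsets, and address mappings.

See also: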

cuMulticastCreate, cuMulticastBindMem, cuMulticastBindAddr, cuMulticastUnbind

CUresult cuMulticastUnbind ( CUmemGenericAllocationHandle mcHandle, CUdevice dev, size_t mcOffset, size_t size )

Unbind any memory allocations bound to a multicast object at a given offset and up to a given size.

Parameters

mcHandle: Handle representing a multicast object.
dev: Device that hosts the memory allocation.
mcOffset: Offset into the multicast object.
size: Desired size to unbind.

Returns

CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_INVALID_DEVICE, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_PERMITTED, CUDA_ERROR_NOT_SUPPORTED

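Description

Unbinds any memory allocations hosted on dev and bound to a multicast object at mcOffset and up to a given size. The intended size of the unbind and the offset in the multicast range (mcOffset) must each be a multiple of the value returned by cuMulticastGetGranularity with the flag CU_MULTICAST_GRANULARITY_MINIMUM. The size + mcOffset cannot be larger than the total size of the multicast object.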

Note:

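Warning: The mcOffset and size must match the corresponding values specified during the bind call. Any other values may result in undefined behavior.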
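One way to honor this matching requirement is to record each successful bind on the host and only issue an unbind for a range that was recorded exactly. The following is a sketch of such a table in plain C. All names here (mc_binding, mc_binding_table, MAX_BINDINGS, and the two functions) are hypothetical, the fixed capacity is arbitrary, and dev is a plain int standing in for a CUdevice value. No driver calls are made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BINDINGS 16 /* arbitrary capacity chosen for this sketch */

/* One recorded bind: which device, at which multicast offset, how large. */
typedef struct {
    int    dev;
    size_t mc_offset;
    size_t size;
    bool   in_use;
} mc_binding;

/* Zero-initialize to get an empty table. */
typedef struct {
    mc_binding slots[MAX_BINDINGS];
} mc_binding_table;

/* Remember a successful bind; returns false if the table is full. */
static bool mc_bindings_record(mc_binding_table *tbl, int dev,
                               size_t mc_offset, size_t size)
{
    assert(tbl != NULL);
    for (size_t i = 0; i < MAX_BINDINGS; ++i) {
        if (!tbl->slots[i].in_use) {
            tbl->slots[i] = (mc_binding){ dev, mc_offset, size, true };
            return true;
        }
    }
    return false;
}

/* Returns true and forgets the entry only if (dev, mc_offset, size)
 * matches a recorded bind exactly; otherwise leaves the table unchanged. */
static bool mc_bindings_take(mc_binding_table *tbl, int dev,
                             size_t mc_offset, size_t size)
{
    assert(tbl != NULL);
    for (size_t i = 0; i < MAX_BINDINGS; ++i) {
        mc_binding *b = &tbl->slots[i];
        if (b->in_use && b->dev == dev
            && b->mc_offset == mc_offset && b->size == size) {
            b->in_use = false;
            return true;
        }
    }
    return false;
}
```

A caller would invoke mc_bindings_record after a bind returns CUDA_SUCCESS, and call cuMulticastUnbind only when mc_bindings_take returns true for the same values.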

See also:

cuMulticastBindMem, cuMulticastBindAddr