Hi Andrew, hello world,
Now with AMD Instinct MI200 data - see below.
And a better look at the numbers. In terms of USM,
there does not seem to be any clear winner of both
approaches. If we want to draw conclusions, definitely
more runs are needed (statistics):
The runs below show that the diff
Andrew Stubbs wrote:
PS: I would love to do some comparisons [...]
Actually, I think testing only data transfer is fine for this, but we
might like to try some different access patterns, besides straight
linear copies.
I have now tried it on my laptop with
BabelStream,https://github.com/UoB-
On 03/06/2024 21:40, Tobias Burnus wrote:
Andrew Stubbs wrote:
On 03/06/2024 17:46, Tobias Burnus wrote:
Andrew Stubbs wrote:
+ /* If USM has been requested and is supported by all devices
+ of this type, set the capability accordingly. */
+ if (omp_requires_mask & GOMP
Andrew Stubbs wrote:
On 03/06/2024 17:46, Tobias Burnus wrote:
Andrew Stubbs wrote:
+ /* If USM has been requested and is supported by all devices
+ of this type, set the capability accordingly. */
+ if (omp_requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY)
+
On 03/06/2024 17:46, Tobias Burnus wrote:
Andrew Stubbs wrote:
+ /* If USM has been requested and is supported by all devices
+ of this type, set the capability accordingly. */
+ if (omp_requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY)
+ current_device.capab
Andrew Stubbs wrote:
+ /* If USM has been requested and is supported by all devices
+ of this type, set the capability accordingly. */
+ if (omp_requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY)
+ current_device.capabilities |= GOMP_OFFLOAD_CAP_SHARED_MEM;
+
On 28/05/2024 23:33, Tobias Burnus wrote:
While most of the nvptx systems I have access to don't have the support
for CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS_USES_HOST_PAGE_TABLES,
one has:
Tesla V100-SXM2-16GB (as installed, e.g., on ORNL's Summit) does support
this feature. And with that
On Wed, May 29, 2024 at 08:20:01AM +0200, Tobias Burnus wrote:
> + if (num_devices > 0
> + && (omp_requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY))
> +for (int dev = 0; dev < num_devices; dev++)
> + {
> + int pi;
> + CUresult r;
> + r = CUDA_CALL_NOCHECK (cuDeviceGet
Tobias Burnus wrote:
While most of the nvptx systems I have access to don't have the
support for
CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS_USES_HOST_PAGE_TABLES, one
has:
Actually, CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS is sufficient. And
I finally also found the proper webpage for this
While most of the nvptx systems I have access to don't have the support
for CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS_USES_HOST_PAGE_TABLES,
one has:
Tesla V100-SXM2-16GB (as installed, e.g., on ORNL's Summit) does support
this feature. And with that feature, unified-shared memory support doe
10 matches
Mail list logo