On 21/11/2022 13:40, Tobias Burnus wrote:
Working on the builtins, I realized that I mixed up (again) bits and bytes.
While 'uint64_t var[2]' has a size of 128 bits, 'char var[128]' has a size of 128 bytes. Thus, there is sufficient space for 16 pointer-sized/uint64_t values, but I only need 6.
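
To make the arithmetic concrete, here is a minimal C sketch. The union, its member names, and the helper function are hypothetical and only illustrate the sizes; they are not the actual libgomp definitions. It assumes a platform with 8-bit bytes.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload size, matching the 'char var[128]' example.  */
#define PAYLOAD_BYTES 128

/* Hypothetical union: the same 128 bytes viewed either as raw chars or
   as uint64_t slots.  The uint64_t member also gives the whole union
   64-bit alignment.  */
union payload
{
  char value[PAYLOAD_BYTES];
  uint64_t value64[PAYLOAD_BYTES / sizeof (uint64_t)];
};

/* 'uint64_t var[2]' is 128 bits in total.  */
static_assert (sizeof (uint64_t[2]) * CHAR_BIT == 128,
               "two uint64_t values span 128 bits");

/* 'char var[128]' is 128 bytes in total.  */
static_assert (sizeof (char[PAYLOAD_BYTES]) == 128,
               "char[128] spans 128 bytes");

/* Number of uint64_t slots that fit into the payload.  */
size_t
payload_slots (void)
{
  union payload p;
  return sizeof p.value64 / sizeof p.value64[0];
}
```

With 8-bit bytes, payload_slots () yields 16, so six 64-bit values fit with room to spare.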

This patch now makes use of the available space, avoiding one device-to-host memory copy; additionally, it avoids a 32-bit vs. 64-bit alignment issue which I somehow missed :-(

Tested with libgomp on gfx908 offloading and getting only the known fails:
(libgomp.c-c++-common/teams-2.c, libgomp.fortran/async_io_*.f90,
libgomp.oacc-c-c++-common/{deep-copy-10.c,static-variable-1.c,vprop.c})

OK for mainline?

OK, although why not set value64 to 16 entries, even though reverse offload only uses 6?

Andrew
