On 21/11/2022 13:41, Tobias Burnus wrote:
On 19.11.22 11:46, Tobias Burnus wrote:
+       stacklimit = stackbase + seg_size*64;
(This should be '*seg_size', not 'seg_size', and the variable should be
renamed: s/seg_size/seg_size_ptr/.)
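
To make the correction above concrete, here is a minimal C sketch of the same computation. It is only an illustration, not the patch itself: the helper name stack_limit and the integer types are assumptions, and the factor 64 is taken as-is from the quoted line.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch.  seg_size_ptr is assumed
   to point at a 32-bit segment size; 64 is the factor from the quoted
   line.  */
static uintptr_t
stack_limit (uintptr_t stackbase, const uint32_t *seg_size_ptr)
{
  assert (seg_size_ptr != 0);
  /* Dereference the pointer.  Multiplying seg_size_ptr itself would
     scale an address instead of the size stored there.  */
  return stackbase + (uintptr_t) *seg_size_ptr * 64;
}
```

With a segment size of 16 and a stack base of 0x1000, this returns 0x1000 + 16*64.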
I have updated the comment and ...
(Reading it, I think it should be '..._MEM(SImode,' and
'..._MULT(SImode' instead of DImode.)
Additionally, there was a problem of bytes vs. bits in:
My understanding is that
dispatch_ptr->private_segment_size == *((char*)dispatch_ptr + 192)

which is wrong - it's 192 bits, which is only 24 bytes!
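
The bits-vs-bytes mix-up can be checked with a small C sketch. The struct below is a stand-in modelled on the HSA kernel dispatch packet; its name and exact field list are assumptions made for illustration, and only the fields up to private_segment_size are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the dispatch packet (assumed layout, truncated after
   the field of interest).  */
struct dispatch_packet_sketch
{
  uint16_t header;
  uint16_t setup;
  uint16_t workgroup_size_x, workgroup_size_y, workgroup_size_z;
  uint16_t reserved0;
  uint32_t grid_size_x, grid_size_y, grid_size_z;
  uint32_t private_segment_size;
};

enum
{
  PRIVATE_SEGMENT_SIZE_BIT_OFFSET = 192,
  PRIVATE_SEGMENT_SIZE_BYTE_OFFSET = PRIVATE_SEGMENT_SIZE_BIT_OFFSET / 8
};

/* The field sits 192 bits, i.e. 24 bytes, into the packet.  */
_Static_assert (offsetof (struct dispatch_packet_sketch,
                          private_segment_size)
                == PRIVATE_SEGMENT_SIZE_BYTE_OFFSET,
                "private_segment_size is at byte offset 24");

/* Read the field through a byte pointer, using the byte offset.  */
static uint32_t
read_private_segment_size (const void *dispatch_ptr)
{
  uint32_t value;
  assert (dispatch_ptr != NULL);
  memcpy (&value,
          (const char *) dispatch_ptr + PRIVATE_SEGMENT_SIZE_BYTE_OFFSET,
          sizeof value);
  return value;
}
```

Adding 192 to a char pointer would land 168 bytes past the field, which is the error described above.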

Finally, in the first_call_this_thread_p() call, I mixed up EQ vs. NE at one place.

BTW: Judging by the assembler output, there seems to be no problem with zero extension.

Updated version. Consists of: GCC patch adding the builtins,
the newlib patch using those (unchanged; used for testing + to be submitted), and
a 'test.c' using the builtins and its dump produced with amdgcn's
'cc1 -O2' to show the resulting assembly.

Tested with libgomp, offloading to gfx908; only the known failures show up:
(libgomp.c-c++-common/teams-2.c, libgomp.fortran/async_io_*.f90,
libgomp.oacc-c-c++-common/{deep-copy-10.c,static-variable-1.c,vprop.c})

OK for mainline?

OK, provided it has been tested in both stand-alone and offload modes, and the newlib tests too.

Andrew
