On 10/6/20 5:02 PM, Jakub Jelinek wrote:
> On Tue, Oct 06, 2020 at 04:48:40PM +0200, Tom de Vries wrote:
>> On 10/5/20 3:15 PM, Tom de Vries wrote:
>>> On 2/7/20 4:29 PM, Jakub Jelinek wrote:
>>>> On Fri, Feb 07, 2020 at 09:56:38AM +0100, Harwath, Frederik wrote:
>>>>> * {target-32.c, thread-limit-2.c}:
>>>>> no "usleep" implemented for nvptx. Cf. https://gcc.gnu.org/PR81690
>>>>
>>>> Please don't, I want to deal with that using declare variant, just didn't
>>>> get yet around to finishing the last patch needed for that. Will try next
>>>> week.
>>>>
>>>
>>> Hi Jakub,
>>>
>>> Ping, any update on this?
>
> Not finished the last step, I run into LTO issues. Will need to return to
> that soon.
> Last progress in "[RFH] LTO cgraph support for late declare variant
> resolution"
> mail from May on gcc-patches.
>
Ack, thanks for the update.

>> --- a/libgomp/testsuite/libgomp.c/target-32.c
>> +++ b/libgomp/testsuite/libgomp.c/target-32.c
>> @@ -1,6 +1,26 @@
>>  #include <stdlib.h>
>>  #include <unistd.h>
>>
>> +extern void base_delay(int);
>
> No need to declare this one early.
>
>> +extern void nvptx_delay(int);
>
> Space before (, and the definition could go here instead of
> the declaration.
>
>> +#pragma omp declare variant( nvptx_delay ) match( construct={target},
>> implementation={vendor(nvidia)} )
>
> This isn't the right declare variant for what we want though,
> we only provide gnu as accepted vendor, it is implementation's vendor,
> not vendor of one of the hw components.
> So, it ought to be instead
> #pragma omp declare variant (nvptx_delay)
> match(construct={target},device={arch(nvptx)})
>
>> +void base_delay(int d)
>> +{
>> +  usleep (d);
>> +}

I've updated the patch accordingly.

FWIW, I now run into an ICE which looks like PR96680:
...
lto1: internal compiler error: in lto_fixup_prevailing_decls, at lto/lto-common.c:2595
0x93afcd lto_fixup_prevailing_decls
        /home/vries/oacc/trunk/source-gcc/gcc/lto/lto-common.c:2595
0x93b1d6 lto_fixup_decls
        /home/vries/oacc/trunk/source-gcc/gcc/lto/lto-common.c:2645
0x93bcc4 read_cgraph_and_symbols(unsigned int, char const**)
        /home/vries/oacc/trunk/source-gcc/gcc/lto/lto-common.c:2897
0x910358 lto_main()
        /home/vries/oacc/trunk/source-gcc/gcc/lto/lto.c:625
...

Thanks,
- Tom
diff --git a/libgomp/testsuite/libgomp.c/target-32.c b/libgomp/testsuite/libgomp.c/target-32.c
index 233877b702b..b8deae72b08 100644
--- a/libgomp/testsuite/libgomp.c/target-32.c
+++ b/libgomp/testsuite/libgomp.c/target-32.c
@@ -1,6 +1,25 @@
 #include <stdlib.h>
 #include <unistd.h>
 
+void
+nvptx_delay (int d)
+{
+  /* This function serves as a replacement for usleep in
+     this test case.  It does not even attempt to be functionally
+     equivalent - we just want some sort of delay.  */
+  int i;
+  int N = d * 2000;
+  for (i = 0; i < N; i++)
+    asm volatile ("" : : : "memory");
+}
+
+#pragma omp declare variant (nvptx_delay) match(construct={target},device={arch(nvptx)})
+void
+base_delay(int d)
+{
+  usleep (d);
+}
+
 int main ()
 {
   int a = 0, b = 0, c = 0, d[7];
@@ -18,28 +37,28 @@ int main ()
 
       #pragma omp target nowait map(alloc: b) depend(in: d[2]) depend(out: d[3])
       {
-        usleep (1000);
+        base_delay (1000);
         #pragma omp atomic update
         b |= 4;
       }
 
       #pragma omp target nowait map(alloc: b) depend(in: d[2]) depend(out: d[4])
       {
-        usleep (5000);
+        base_delay (5000);
         #pragma omp atomic update
         b |= 1;
       }
 
       #pragma omp target nowait map(alloc: c) depend(in: d[3], d[4]) depend(out: d[5])
       {
-        usleep (5000);
+        base_delay (5000);
         #pragma omp atomic update
         c |= 8;
       }
 
       #pragma omp target nowait map(alloc: c) depend(in: d[3], d[4]) depend(out: d[6])
       {
-        usleep (1000);
+        base_delay (1000);
         #pragma omp atomic update
         c |= 2;
       }
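
For reference, here is a reduced sketch of the pattern the patch relies on, outside the testsuite. It is only meant to illustrate the selector Jakub suggested (device={arch(nvptx)} rather than implementation={vendor(nvidia)}). run_delays is a hypothetical helper that is not part of the patch, and the map(tofrom: b) clauses are a simplification of the nowait/depend structure in target-32.c. I have only reasoned about the host-fallback case here: with no offloading configured, or with -fopenmp omitted so that the pragmas are ignored, the base function runs on the host and calls usleep.

```c
#include <unistd.h>

/* Busy-wait stand-in for usleep on the offload device.  Not functionally
   equivalent to usleep; it only burns some time.  */
void
nvptx_delay (int d)
{
  int i;
  int n = d * 2000;
  for (i = 0; i < n; i++)
    __asm__ volatile ("" : : : "memory");
}

/* Inside a target construct, when compiling for an nvptx device, calls to
   base_delay are meant to resolve to nvptx_delay.  Everywhere else the
   base function below is used.  */
#pragma omp declare variant (nvptx_delay) match(construct={target},device={arch(nvptx)})
void
base_delay (int d)
{
  usleep (d);
}

/* Hypothetical helper (not in the patch): run two target regions that
   each delay and then set one bit, and return the combined mask.  */
int
run_delays (void)
{
  int b = 0;

  #pragma omp target map(tofrom: b)
  {
    base_delay (1000);
    b |= 4;
  }

  #pragma omp target map(tofrom: b)
  {
    base_delay (5000);
    b |= 1;
  }

  return b;
}
```

On the host, calling run_delays () should return 5 (bits 4 and 1 set). Whether the nvptx side actually links depends on the late declare variant resolution discussed above, so treat the offloaded behaviour as an assumption rather than something this sketch demonstrates.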