On 10/22/20 3:19 PM, Jakub Jelinek wrote:
> On Tue, Oct 06, 2020 at 05:45:31PM +0200, Tom de Vries wrote:
>> I've updated the patch accordingly.
>>
>> FWIW, I now run into an ICE which looks like PR96680:
>
> With the patch I've posted today to fix up declare variant LTO handling,
> Tobias reported the patch still doesn't work, and there are two
> reasons for that.
> One is that when the base function is marked implicitly as declare target,
> we don't mark also implicitly the variants.  I'll need to ask on omp-lang
> about details for that, but generally the compiler should do it some way.
> The other one is that the way base_delay is written, it will always
> call the usleep function, which is undesirable for nvptx.  While the
> compiler will replace all direct calls to base_delay to nvptx_delay,
> the base_delay definition which calls usleep stays.
>
> The following should work instead (I've tested it without offloading and
> Tobias with offloading):
>

I've tested this patch in combination with:
- "[PATCH] lto: LTO cgraph support for late declare variant resolution"
  https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556793.html
- "[omp, simt] Handle alternative IV"
  https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555352.html
on top of commit c26d7df1031 "OpenMP: Fortran - support omp flush's
memorder clauses".

The only FAILs I see are for PR97532 (
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97532 ), 10 in total.

So, LGTM.

Thanks,
- Tom

> 2020-10-22  Jakub Jelinek  <ja...@redhat.com>
>             Tom de Vries  <tdevr...@suse.de>
>
>         PR testsuite/81690
>         * testsuite/libgomp.c/usleep.h: New file.
>         * testsuite/libgomp.c/target-32.c: Include usleep.h.
>         (main): Use tgt_usleep instead of usleep.
>         * testsuite/libgomp.c/thread-limit-2.c: Include usleep.h.
>         (main): Use tgt_usleep instead of usleep.
>
> --- gcc/libgomp/testsuite/libgomp.c/usleep.h.jj 2020-10-22 14:45:14.034196695 +0200
> +++ gcc/libgomp/testsuite/libgomp.c/usleep.h 2020-10-22 14:48:05.186719495 +0200
> @@ -0,0 +1,24 @@
> +#include <unistd.h>
> +
> +int
> +nvptx_usleep (useconds_t d)
> +{
> +  /* This function serves as a replacement for usleep in
> +     this test case.  It does not even attempt to be functionally
> +     equivalent - we just want some sort of delay.  */
> +  int i;
> +  int N = d * 2000;
> +  for (i = 0; i < N; i++)
> +    asm volatile ("" : : : "memory");
> +  return 0;
> +}
> +
> +#pragma omp declare variant (nvptx_usleep) match(construct={target},device={arch(nvptx)})
> +#pragma omp declare variant (usleep) match(user={condition(1)})
> +int
> +tgt_usleep (useconds_t d)
> +{
> +  return 0;
> +}
> +
> +#pragma omp declare target to (nvptx_usleep, tgt_usleep)
> --- gcc/libgomp/testsuite/libgomp.c/target-32.c.jj 2020-01-12 11:54:39.037373820 +0100
> +++ gcc/libgomp/testsuite/libgomp.c/target-32.c 2020-10-22 14:46:23.211195456 +0200
> @@ -1,5 +1,6 @@
>  #include <stdlib.h>
>  #include <unistd.h>
> +#include "usleep.h"
>  
>  int main ()
>  {
> @@ -18,28 +19,28 @@ int main ()
>  
>      #pragma omp target nowait map(alloc: b) depend(in: d[2]) depend(out: d[3])
>      {
> -      usleep (1000);
> +      tgt_usleep (1000);
>        #pragma omp atomic update
>        b |= 4;
>      }
>  
>      #pragma omp target nowait map(alloc: b) depend(in: d[2]) depend(out: d[4])
>      {
> -      usleep (5000);
> +      tgt_usleep (5000);
>        #pragma omp atomic update
>        b |= 1;
>      }
>  
>      #pragma omp target nowait map(alloc: c) depend(in: d[3], d[4]) depend(out: d[5])
>      {
> -      usleep (5000);
> +      tgt_usleep (5000);
>        #pragma omp atomic update
>        c |= 8;
>      }
>  
>      #pragma omp target nowait map(alloc: c) depend(in: d[3], d[4]) depend(out: d[6])
>      {
> -      usleep (1000);
> +      tgt_usleep (1000);
>        #pragma omp atomic update
>        c |= 2;
>      }
> --- gcc/libgomp/testsuite/libgomp.c/thread-limit-2.c.jj 2020-01-12 11:54:39.037373820 +0100
> +++ gcc/libgomp/testsuite/libgomp.c/thread-limit-2.c 2020-10-22 14:57:31.957516284 +0200
> @@ -4,6 +4,7 @@
>  #include <stdlib.h>
>  #include <unistd.h>
>  #include <omp.h>
> +#include "usleep.h"
>  
>  int
>  main ()
> @@ -48,7 +49,7 @@ main ()
>            v = ++cnt;
>            if (v > 6)
>              abort ();
> -          usleep (10000);
> +          tgt_usleep (10000);
>            #pragma omp atomic
>            --cnt;
>          }
>
>
>         Jakub
>
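
For readers following along, the pattern used in usleep.h can be sketched in isolation as below.  This is only a minimal sketch under stated assumptions, not the committed test file: the names spin_delay and portable_delay are hypothetical, the 2000 multiplier is copied from the patch and is not calibrated to real time, and the _DEFAULT_SOURCE define is an assumption that only matters for getting the usleep declaration from glibc under strict -std= modes.

```c
/* Assumption: only needed so glibc declares usleep under strict -std= modes.  */
#define _DEFAULT_SOURCE
#include <unistd.h>

/* Hypothetical name.  Spin-based stand-in for usleep on a target that
   lacks it.  Not functionally equivalent to usleep; it only burns some
   time.  The factor 2000 is taken from the patch and is uncalibrated.  */
int
spin_delay (useconds_t d)
{
  unsigned long i;
  unsigned long n = (unsigned long) d * 2000;
  for (i = 0; i < n; i++)
    /* Empty asm with a memory clobber, so the loop is not optimized away.  */
    __asm__ volatile ("" : : : "memory");
  return 0;
}

/* Hypothetical name.  The variants are attached to the base function.
   The base body deliberately does not call usleep, so no usleep
   reference remains in the base definition itself.  */
#pragma omp declare variant (spin_delay) match(construct={target},device={arch(nvptx)})
#pragma omp declare variant (usleep) match(user={condition(1)})
int
portable_delay (useconds_t d)
{
  (void) d;
  return 0;
}

#pragma omp declare target to (spin_delay, portable_delay)
```

Built without -fopenmp, the pragmas are ignored and portable_delay simply returns 0.  With -fopenmp, the intent is that calls resolve to usleep on the host and to spin_delay inside target regions compiled for nvptx.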