On 10/6/20 5:02 PM, Jakub Jelinek wrote:
> On Tue, Oct 06, 2020 at 04:48:40PM +0200, Tom de Vries wrote:
>> On 10/5/20 3:15 PM, Tom de Vries wrote:
>>> On 2/7/20 4:29 PM, Jakub Jelinek wrote:
>>>> On Fri, Feb 07, 2020 at 09:56:38AM +0100, Harwath, Frederik wrote:
>>>>> * {target-32.c, thread-limit-2.c}:
>>>>> no "usleep" implemented for nvptx. Cf. https://gcc.gnu.org/PR81690
>>>>
>>>> Please don't, I want to deal with that using declare variant, just didn't
>>>> get yet around to finishing the last patch needed for that.  Will try next 
>>>> week.
>>>>
>>>
>>> Hi Jakub,
>>>
>>> Ping, any update on this?
> 
> Not finished the last step, I run into LTO issues.  Will need to return to
> that soon.
> Last progress in "[RFH] LTO cgraph support for late declare variant 
> resolution"
> mail from May on gcc-patches.
> 

Ack, thanks for the update.

>> --- a/libgomp/testsuite/libgomp.c/target-32.c
>> +++ b/libgomp/testsuite/libgomp.c/target-32.c
>> @@ -1,6 +1,26 @@
>>  #include <stdlib.h>
>>  #include <unistd.h>
>>  
>> +extern void base_delay(int);
> 
> No need to declare this one early.
> 
>> +extern void nvptx_delay(int);
> 
> Space before (, and the definition could go here instead of
> the declaration.
> 
>> +#pragma omp declare variant( nvptx_delay ) match( construct={target}, 
>> implementation={vendor(nvidia)} )
> 
> This isn't the right declare variant for what we want though,
> we only provide gnu as accepted vendor, it is implementation's vendor,
> not vendor of one of the hw components.
> So, it ought to be instead
> #pragma omp declare variant (nvptx_delay) 
> match(construct={target},device={arch(nvptx)})
> 
>> +void base_delay(int d)
>> +{
>> +  usleep (d);
>> +}

I've updated the patch accordingly.

FWIW, I now run into an ICE which looks like PR96680:
...
lto1: internal compiler error: in lto_fixup_prevailing_decls, at
lto/lto-common.c:2595^M
0x93afcd lto_fixup_prevailing_decls^M
        /home/vries/oacc/trunk/source-gcc/gcc/lto/lto-common.c:2595^M
0x93b1d6 lto_fixup_decls^M
        /home/vries/oacc/trunk/source-gcc/gcc/lto/lto-common.c:2645^M
0x93bcc4 read_cgraph_and_symbols(unsigned int, char const**)^M
        /home/vries/oacc/trunk/source-gcc/gcc/lto/lto-common.c:2897^M
0x910358 lto_main()^M
        /home/vries/oacc/trunk/source-gcc/gcc/lto/lto.c:625^M
...

Thanks,
- Tom
diff --git a/libgomp/testsuite/libgomp.c/target-32.c b/libgomp/testsuite/libgomp.c/target-32.c
index 233877b702b..b8deae72b08 100644
--- a/libgomp/testsuite/libgomp.c/target-32.c
+++ b/libgomp/testsuite/libgomp.c/target-32.c
@@ -1,6 +1,25 @@
 #include <stdlib.h>
 #include <unistd.h>
 
+void
+nvptx_delay (int d)
+{
+  /* This function serves as a replacement for usleep in
+     this test case.  It does not even attempt to be functionally
+     equivalent  - we just want some sort of delay. */
+  int i;
+  int N = d * 2000;
+  for (i = 0; i < N; i++)
+    asm volatile ("" : : : "memory");
+}
+
+#pragma omp declare variant (nvptx_delay) match(construct={target},device={arch(nvptx)})
+void
+base_delay(int d)
+{
+  usleep (d);
+}
+
 int main ()
 {
   int a = 0, b = 0, c = 0, d[7];
@@ -18,28 +37,28 @@ int main ()
 
     #pragma omp target nowait map(alloc: b) depend(in: d[2]) depend(out: d[3])
     {
-      usleep (1000);
+      base_delay (1000);
       #pragma omp atomic update
       b |= 4;
     }
 
     #pragma omp target nowait map(alloc: b) depend(in: d[2]) depend(out: d[4])
     {
-      usleep (5000);
+      base_delay (5000);
       #pragma omp atomic update
       b |= 1;
     }
 
     #pragma omp target nowait map(alloc: c) depend(in: d[3], d[4]) depend(out: d[5])
     {
-      usleep (5000);
+      base_delay (5000);
       #pragma omp atomic update
       c |= 8;
     }
 
     #pragma omp target nowait map(alloc: c) depend(in: d[3], d[4]) depend(out: d[6])
     {
-      usleep (1000);
+      base_delay (1000);
       #pragma omp atomic update
       c |= 2;
     }

Reply via email to