Hi!

Jakub, please note question below.

On 2019-11-15T20:11:29+0100, Tobias Burnus <tob...@codesourcery.com> wrote:
> updated version. Changes:
> * Incorporate Thomas's changes
> * Add no_create clause to newly added 'acc serial'
> * Renamed (G)OMP_MAP_NO_ALLOC to (G)OMP_MAP_IF_PRESENT as proposed
> * Make no_create.c effective by adding 'has_firstprivate = true;' to 
> target.c.*

Thanks.

> (* If one tries to access c or e in the no_create-3.{c,f90} run-time 
> test case, plugin-nvidia rightly complains (illegal memory access), 
> using the created 'b' or 'd' works as tested by the test case.

So that's specifically what you fixed above, or is that another problem?

> This 
> feature seems to be also broken on the OG9 branch.)

Not surprising, given the insufficient testsuite coverage...  ;'-|

I note that you've not addressed the other TODO items that I had put into
the libgomp memory mapping code (see below for reference).  I still think
that this should be understood better, that the code as currently
proposed/discussed is "too complex".  I have an idea how to do this
differently (easier?), but I still have to sketch that out, and not sure
when I'll get to that.  I'm willing to accept that patch as-is, unless
Jakub has any further comments at this point.


Another thing: I've added just another little bit of testsuite coverage,
and another thing broke.  See "TODO" in attached incremental patch.
(Please rename the files appropriately.)  Please have a look.

This feels like something going wrong in gimplification, when we "Look in
outer OpenACC contexts, to see if there's a data attribute for this
variable" ('gcc/gimplify.c:omp_notice_variable'), but that's just a wild
guess.  If you agree/understand that there is a problem, and add some
XFAILed 'gimple' tree-scanning test cases (maybe even just to the libgomp
test cases that I've added), I'm fine to accept that XFAILed, to be
resolved later.

Maybe even that's not specific to the 'no_create' clause, just doesn't
cause any harm (given the existing testsuite...) for other OpenACC
constructs/clauses?


The incremental Fortran test case changes have bene done in a rush; not
sure if they make much sense, or should see some further work applied to
them.


With these items considered/addressed as you feel comfortable, this is OK
for trunk.  To record the review effort, please include "Reviewed-by:
Thomas Schwinge <tho...@codesourcery.com>" in the commit log, see
<https://gcc.gnu.org/wiki/Reviewed-by>.


> PS: Remaining bits of the OG9 patch, which are not included are the 
> following. I think those are all attach/detach features: a test case for 
> "no_create(s.y…)" (i.e. the struct component-ref; 
> libgomp/testsuite/libgomp.oacc-c-c++-common/nocreate-{3,4}.c) and some 
> 'do_detach = false' in libgomp/target.c. Cf. openacc-gcc-9 /…-8 branch 
> patch is commit 8e74c2ec2b90819c995444370e742864a685209f of Dec 20, 
> 2018. It has been posted as 
> https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01418.html


The libgomp memory mapping code:

> Add OpenACC 2.6 `no_create' clause support
>
> The clause makes any device code use the local memory address for each
> of the variables specified unless the given variable is already present
> on the current device.

> --- a/include/gomp-constants.h
> +++ b/include/gomp-constants.h
> @@ -75,6 +75,8 @@ enum gomp_map_kind
>      GOMP_MAP_DEVICE_RESIDENT =               (GOMP_MAP_FLAG_SPECIAL_1 | 1),
>      /* OpenACC link.  */
>      GOMP_MAP_LINK =                  (GOMP_MAP_FLAG_SPECIAL_1 | 2),
> +    /* Use device data if present, fall back to host address otherwise.  */
> +    GOMP_MAP_IF_PRESENT =                    (GOMP_MAP_FLAG_SPECIAL_1 | 3),
>      /* Do not map, copy bits for firstprivate instead.  */
>      GOMP_MAP_FIRSTPRIVATE =          (GOMP_MAP_FLAG_SPECIAL | 0),
>      /* Similarly, but store the value in the pointer rather than

> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -667,6 +667,13 @@ gomp_map_vars_internal (struct gomp_device_descr 
> *devicep,
>         has_firstprivate = true;
>         continue;
>       }
> +      else if ((kind & typemask) == GOMP_MAP_IF_PRESENT)
> +     {
> +       tgt->list[i].key = NULL;
> +       tgt->list[i].offset = 0;
> +       has_firstprivate = true;
> +       continue;
> +     }
>        cur_node.host_start = (uintptr_t) hostaddrs[i];
>        if (!GOMP_MAP_POINTER_P (kind & typemask))
>       cur_node.host_end = cur_node.host_start + sizes[i];
> @@ -892,6 +899,49 @@ gomp_map_vars_internal (struct gomp_device_descr 
> *devicep,
>               cur_node.tgt_offset = n->tgt->tgt_start + n->tgt_offset
>                                     + cur_node.host_start - n->host_start;
>               continue;
> +           case GOMP_MAP_IF_PRESENT:
> +             {
> +               cur_node.host_start = (uintptr_t) hostaddrs[i];
> +               cur_node.host_end = cur_node.host_start + sizes[i];
> +               splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
> +               if (n != NULL)
> +                 {
> +                   tgt->list[i].key = n;
> +                   tgt->list[i].offset = cur_node.host_start - n->host_start;
> +                   tgt->list[i].length = n->host_end - n->host_start;
> +                   tgt->list[i].copy_from = false;
> +                   tgt->list[i].always_copy_from = false;
> +                   n->refcount++;
> +                 }
> +               else
> +                 {
> +                   tgt->list[i].key = NULL;
> +                   tgt->list[i].offset = OFFSET_INLINED;
> +                   tgt->list[i].length = sizes[i];
> +                   tgt->list[i].copy_from = false;
> +                   tgt->list[i].always_copy_from = false;
> +                   if (i + 1 < mapnum)
> +                     {
> +                       int kind2 = get_kind (short_mapkind, kinds, i + 1);
> +                       switch (kind2 & typemask)
> +                         {
> +                         case GOMP_MAP_POINTER:
> +                           /* The data is not present but we have an attach
> +                              or pointer clause next.  Skip over it.  */
> +                           i++;
> +                           tgt->list[i].key = NULL;
> +                           tgt->list[i].offset = OFFSET_INLINED;
> +                           tgt->list[i].length = sizes[i];
> +                           tgt->list[i].copy_from = false;
> +                           tgt->list[i].always_copy_from = false;
> +                           break;
> +                         default:
> +                           break;
> +                         }
> +                     }
> +                 }
> +               continue;
> +             }
>             default:
>               break;
>             }

My TODO items:

--- libgomp/target.c
+++ libgomp/target.c
@@ -671,6 +671,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
        }
       else if ((kind & typemask) == GOMP_MAP_IF_PRESENT)
        {
+         //TODO TS is confused.  Handling this here, will inhibit 
'gomp_map_vars_existing' being used a bit further below.
          tgt->list[i].key = NULL;
          tgt->list[i].offset = 0;
          has_firstprivate = true;
@@ -908,6 +910,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
                  splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
                  if (n != NULL)
                    {
+                     //TODO TS is confused.  Due to the way the handling of 
'GOMP_MAP_NO_ALLOC' is done in the first loop, we're here re-doing 
'gomp_map_vars_existing'?
                      tgt->list[i].key = n;
                      tgt->list[i].offset = cur_node.host_start - n->host_start;
                      tgt->list[i].length = n->host_end - n->host_start;
@@ -917,6 +920,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
                    }
                  else
                    {
+                     //TODO This is basically 'GOMP_MAP_FIRSTPRIVATE_INT' 
handling?
                      tgt->list[i].key = NULL;
                      tgt->list[i].offset = OFFSET_INLINED;
                      tgt->list[i].length = sizes[i];
@@ -928,6 +932,11 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
                          switch (kind2 & typemask)
                            {
                            case GOMP_MAP_POINTER:
+                             //TODO abort();
+                             //TODO This code path is exercised by 
'libgomp.oacc-fortran/no_create-2.f90'.
+                             //TODO TS does not yet understand why this is 
needed.
+                             //TODO Is this somehow similar to 
'GOMP_MAP_TO_PSET' handling?
+
                              /* The data is not present but we have an attach
                                 or pointer clause next.  Skip over it.  */
                              i++;


Grüße
 Thomas


From 9a46a8af6374d248c77d6834efaff971da10ecbe Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <tho...@codesourcery.com>
Date: Mon, 2 Dec 2019 12:53:17 +0100
Subject: [PATCH] Add OpenACC 2.6 `no_create' clause support: some more testing

---
 .../libgomp.oacc-c-c++-common/no_create-1.c   | 27 ++++--
 .../libgomp.oacc-c-c++-common/no_create-1_.c  | 82 +++++++++++++++++++
 .../libgomp.oacc-c-c++-common/no_create-2.c   | 18 ++--
 .../libgomp.oacc-c-c++-common/no_create-2_.c  | 49 +++++++++++
 .../libgomp.oacc-fortran/no_create-1.f90      | 24 +++---
 .../libgomp.oacc-fortran/no_create-2.f90      | 47 +++++++----
 6 files changed, 206 insertions(+), 41 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-1_.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-2_.c

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-1.c
index c7a1bd9c015..22e0c20cce9 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-1.c
@@ -1,4 +1,5 @@
-/* Test no_create clause when data is present on the device.  */
+/* Test 'no_create' clause on compute construct, with data present on the
+   device.  */
 
 #include <stdlib.h>
 #include <stdio.h>
@@ -9,28 +10,36 @@
 int
 main (int argc, char *argv[])
 {
+  int var;
   int *arr = (int *) malloc (N * sizeof (*arr));
-  int *devptr;
+  int *devptr[2];
 
+  acc_copyin (&var, sizeof (var));
   acc_copyin (arr, N * sizeof (*arr));
 
-  #pragma acc parallel no_create(arr[0:N]) copyout(devptr)
+#pragma acc parallel no_create(var, arr[0:N]) copyout(devptr)
   {
-    devptr = &arr[2];
+    devptr[0] = &var;
+    devptr[1] = &arr[2];
   }
 
-#if !ACC_MEM_SHARED
-  if (acc_hostptr (devptr) != (void *) &arr[2])
+  if (acc_hostptr (devptr[0]) != (void *) &var)
+    __builtin_abort ();
+  if (acc_hostptr (devptr[1]) != (void *) &arr[2])
     __builtin_abort ();
-#endif
 
+  acc_delete (&var, sizeof (var));
   acc_delete (arr, N * sizeof (*arr));
 
 #if ACC_MEM_SHARED
-  if (&arr[2] != devptr)
+  if (devptr[0] != &var)
+    __builtin_abort ();
+  if (devptr[1] != &arr[2])
     __builtin_abort ();
 #else
-  if (&arr[2] == devptr)
+  if (devptr[0] == &var)
+    __builtin_abort ();
+  if (devptr[1] == &arr[2])
     __builtin_abort ();
 #endif
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-1_.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-1_.c
new file mode 100644
index 00000000000..963cb3a68f6
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-1_.c
@@ -0,0 +1,82 @@
+/* Test 'no_create' clause on 'data' construct and nested compute construct,
+   with data present on the device.  */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <openacc.h>
+
+#define N 128
+
+int
+main (int argc, char *argv[])
+{
+  int var;
+  int *arr = (int *) malloc (N * sizeof (*arr));
+  int *devptr[2];
+
+  acc_copyin (&var, sizeof (var));
+  acc_copyin (arr, N * sizeof (*arr));
+
+#pragma acc data no_create(var, arr[0:N])
+  {
+    devptr[0] = (int *) acc_deviceptr (&var);
+    devptr[1] = (int *) acc_deviceptr (&arr[2]);
+
+    if (devptr[0] == NULL)
+      __builtin_abort ();
+    if (devptr[1] == NULL)
+      __builtin_abort ();
+
+    if (acc_hostptr (devptr[0]) != (void *) &var)
+      __builtin_abort ();
+    if (acc_hostptr (devptr[1]) != (void *) &arr[2])
+      __builtin_abort ();
+
+#if ACC_MEM_SHARED
+    if (devptr[0] != &var)
+      __builtin_abort ();
+    if (devptr[1] != &arr[2])
+      __builtin_abort ();
+#else
+    if (devptr[0] == &var)
+      __builtin_abort ();
+    if (devptr[1] == &arr[2])
+      __builtin_abort ();
+#endif
+
+#pragma acc parallel copyout(devptr)
+    {
+      devptr[0] = &var;
+      devptr[1] = &arr[2];
+    }
+
+    if (devptr[0] == NULL)
+      __builtin_abort ();
+    if (devptr[1] == NULL)
+      __builtin_abort ();
+
+    if (acc_hostptr (devptr[0]) != (void *) &var)
+      __builtin_abort ();
+    if (acc_hostptr (devptr[1]) != (void *) &arr[2])
+      __builtin_abort ();
+
+#if ACC_MEM_SHARED
+    if (devptr[0] != &var)
+      __builtin_abort ();
+    if (devptr[1] != &arr[2])
+      __builtin_abort ();
+#else
+    if (devptr[0] == &var)
+      __builtin_abort ();
+    if (devptr[1] == &arr[2])
+      __builtin_abort ();
+#endif
+  }
+
+  acc_delete (&var, sizeof (var));
+  acc_delete (arr, N * sizeof (*arr));
+
+  free (arr);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-2.c
index 2964a40b217..fbd01a25956 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-2.c
@@ -1,4 +1,5 @@
-/* Test no_create clause when data is not present on the device.  */
+/* Test 'no_create' clause on compute construct, with data not present on the
+   device.  */
 
 #include <stdlib.h>
 #include <stdio.h>
@@ -8,18 +9,19 @@
 int
 main (int argc, char *argv[])
 {
+  int var;
   int *arr = (int *) malloc (N * sizeof (*arr));
-  int *devptr;
+  int *devptr[2];
 
-  #pragma acc data no_create(arr[0:N])
+#pragma acc parallel no_create(var, arr[0:N]) copyout(devptr)
   {
-    #pragma acc parallel copyout(devptr)
-    {
-      devptr = &arr[2];
-    }
+    devptr[0] = &var;
+    devptr[1] = &arr[2];
   }
 
-  if (devptr != &arr[2])
+  if (devptr[0] != &var)
+    __builtin_abort ();
+  if (devptr[1] != &arr[2])
     __builtin_abort ();
 
   free (arr);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-2_.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-2_.c
new file mode 100644
index 00000000000..6f0ace501cf
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-2_.c
@@ -0,0 +1,49 @@
+/* Test 'no_create' clause on 'data' construct and nested compute construct,
+   with data not present on the device.  */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <openacc.h>
+
+#define N 128
+
+int
+main (int argc, char *argv[])
+{
+  int var;
+  int *arr = (int *) malloc (N * sizeof (*arr));
+  int *devptr[2];
+
+#pragma acc data no_create(var, arr[0:N])
+  {
+    devptr[0] = (int *) acc_deviceptr (&var);
+    devptr[1] = (int *) acc_deviceptr (&arr[2]);
+
+#if ACC_MEM_SHARED
+    if (devptr[0] == NULL)
+      __builtin_abort ();
+    if (devptr[1] == NULL)
+      __builtin_abort ();
+#else
+    if (devptr[0] != NULL)
+      __builtin_abort ();
+    if (devptr[1] != NULL)
+      __builtin_abort ();
+#endif
+
+#pragma acc parallel copyout(devptr) // TODO implicit 'copy(var)' -- huh?!
+    {
+      devptr[0] = &var;
+      devptr[1] = &arr[2];
+    }
+
+    if (devptr[0] != &var)
+      __builtin_abort (); // { dg-xfail-run-if "TODO" { *-*-* } { "-DACC_MEM_SHARED=0" } }
+    if (devptr[1] != &arr[2])
+      __builtin_abort ();
+  }
+
+  free (arr);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/no_create-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/no_create-1.f90
index ca9611b777c..4a1d5da98aa 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/no_create-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/no_create-1.f90
@@ -2,12 +2,12 @@
 
 ! Test no_create clause with data construct when data is present/not present.
 
-program nocreate
+program no_create
   use openacc
   implicit none
   logical :: shared_memory
   integer, parameter :: n = 512
-  integer :: myarr(n)
+  integer :: myvar, myarr(n)
   integer i
 
   shared_memory = .false.
@@ -15,21 +15,25 @@ program nocreate
   shared_memory = .true.
   !$acc end kernels
 
+  myvar = 77
   do i = 1, n
     myarr(i) = 0
   end do
 
-  !$acc data no_create (myarr)
-  if (acc_is_present (myarr) .neqv. shared_memory) stop 1
+  !$acc data no_create (myvar, myarr)
+  if (acc_is_present (myvar) .neqv. shared_memory) stop 10
+  if (acc_is_present (myarr) .neqv. shared_memory) stop 11
   !$acc end data
 
-  !$acc enter data copyin (myarr)
-  !$acc data no_create (myarr)
-  if (acc_is_present (myarr) .eqv. .false.) stop 2
+  !$acc enter data copyin (myvar, myarr)
+  !$acc data no_create (myvar, myarr)
+  if (acc_is_present (myvar) .eqv. .false.) stop 20
+  if (acc_is_present (myarr) .eqv. .false.) stop 21
   !$acc end data
-  !$acc exit data copyout (myarr)
+  !$acc exit data copyout (myvar, myarr)
 
+  if (myvar .ne. 77) stop 30
   do i = 1, n
-    if (myarr(i) .ne. 0) stop 3
+    if (myarr(i) .ne. 0) stop 31
   end do
-end program nocreate
+end program no_create
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/no_create-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/no_create-2.f90
index 16227b8ae22..0b11f454aca 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/no_create-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/no_create-2.f90
@@ -2,12 +2,12 @@
 
 ! Test no_create clause with data/parallel constructs.
 
-program nocreate
+program no_create
   use openacc
   implicit none
   logical :: shared_memory
   integer, parameter :: n = 512
-  integer :: myarr(n)
+  integer :: myvar, myarr(n)
   integer i
 
   shared_memory = .false.
@@ -15,47 +15,66 @@ program nocreate
   shared_memory = .true.
   !$acc end kernels
 
+  myvar = 55
   do i = 1, n
     myarr(i) = 0
   end do
 
-  call do_on_target(myarr, n)
+  call do_on_target(myvar, n, myarr)
 
+  if (shared_memory) then
+     if (myvar .ne. 44) stop 10
+  else
+     if (myvar .ne. 33) stop 11
+  end if
   do i = 1, n
     if (shared_memory) then
-      if (myarr(i) .ne. i * 2) stop 1
+      if (myarr(i) .ne. i * 2) stop 20
     else
-      if (myarr(i) .ne. i) stop 2
+      if (myarr(i) .ne. i) stop 21
     end if
   end do
 
+  myvar = 55
   do i = 1, n
     myarr(i) = 0
   end do
 
-  !$acc enter data copyin(myarr)
-  call do_on_target(myarr, n)
-  !$acc exit data copyout(myarr)
+  !$acc enter data copyin(myvar, myarr)
+  call do_on_target(myvar, n, myarr)
+  !$acc exit data copyout(myvar, myarr)
 
+  if (myvar .ne. 44) stop 30
   do i = 1, n
-    if (myarr(i) .ne. i * 2) stop 3
+    if (myarr(i) .ne. i * 2) stop 31
   end do
-end program nocreate
+end program no_create
 
-subroutine do_on_target (arr, n)
+subroutine do_on_target (var, n, arr)
   use openacc
   implicit none
-  integer :: n, arr(n)
+  integer :: var, n, arr(n)
   integer :: i
 
-!$acc data no_create (arr)
+!$acc data no_create (var, arr)
 
+if (acc_is_present(var)) then
+  ! The no_create clause is meant for partially shared-memory machines.  This
+  ! test is written to work on non-shared-memory machines, though this is not
+  ! necessarily a useful way to use the no_create clause in practice.
+
+  !$acc parallel !no_create (var)
+   var = 44
+  !$acc end parallel
+else
+   var = 33
+end if
 if (acc_is_present(arr)) then
   ! The no_create clause is meant for partially shared-memory machines.  This
   ! test is written to work on non-shared-memory machines, though this is not
   ! necessarily a useful way to use the no_create clause in practice.
 
-  !$acc parallel loop no_create (arr)
+  !$acc parallel loop !no_create (arr)
   do i = 1, n
     arr(i) = i * 2
   end do
-- 
2.17.1

Attachment: signature.asc
Description: PGP signature

Reply via email to