Hi! On Tue, 9 Jun 2020 12:41:21 +0200 Thomas Schwinge <tho...@codesourcery.com> wrote:
> Hi Julian! > > On 2020-06-05T21:31:08+0100, Julian Brown <jul...@codesourcery.com> > wrote: > > On Fri, 5 Jun 2020 13:17:09 +0200 > > Thomas Schwinge <tho...@codesourcery.com> wrote: > >> On 2019-12-17T21:03:47-0800, Julian Brown <jul...@codesourcery.com> > >> wrote: > >> > This part contains the libgomp runtime support for the > >> > GOMP_MAP_ATTACH and GOMP_MAP_DETACH mapping kinds > >> > >> > --- a/libgomp/target.c > >> > +++ b/libgomp/target.c > >> > >> > @@ -1203,6 +1211,32 @@ gomp_map_vars_internal (struct > >> > gomp_device_descr *devicep, > >> > >> > + case GOMP_MAP_ATTACH: > >> > + { > >> > + cur_node.host_start = (uintptr_t) > >> > hostaddrs[i]; > >> > + cur_node.host_end = cur_node.host_start + > >> > sizeof (void *); > >> > + splay_tree_key n = splay_tree_lookup > >> > (mem_map, &cur_node); > >> > + if (n != NULL) > >> > + { > >> > + tgt->list[i].key = n; > >> > + tgt->list[i].offset = cur_node.host_start > >> > - n->host_start; > >> > + tgt->list[i].length = n->host_end - > >> > n->host_start; > >> > + tgt->list[i].copy_from = false; > >> > + tgt->list[i].always_copy_from = false; > >> > + tgt->list[i].do_detach > >> > + = (pragma_kind != > >> > GOMP_MAP_VARS_OPENACC_ENTER_DATA); > >> > + n->refcount++; > >> > + } > >> > + else > >> > + { > >> > + gomp_mutex_unlock (&devicep->lock); > >> > + gomp_fatal ("outer struct not mapped for > >> > attach"); > >> > + } > >> > + gomp_attach_pointer (devicep, aq, mem_map, n, > >> > + (uintptr_t) > >> > hostaddrs[i], sizes[i], > >> > + cbufp); > >> > + continue; > >> > + } > >> > >> For the OpenACC runtime API 'acc_attach' etc. routines they don't, > >> so what's the conceptual reason that for the corresponding OpenACC > >> directive variants, 'GOMP_MAP_ATTACH' etc. here participate in > >> reference counting ('n->refcount++' above)? I understand OpenACC > >> 'attach'/'detach' clauses to be simple "executable clauses", which > >> just update some values somewhere (say, like > >> 'GOMP_MAP_ALWAYS_POINTER'), but they don't alter any mapping state, > >> thus wouldn't appear to need reference counting? > > > > IIUC, n->refcount is not directly the "structural reference count" > > as seen at source level, but rather counts the number of > > target_var_descs in the lists appended to each target_mem_desc -- > > and GOMP_MAP_ATTACH have variable entries in those lists. > > That may be OK if that's purely an implementation detail that isn't > visible to the user, however: > > > That's not the case for the API > > routines. > > As I had mentioned, the problem is: in contrast to 'acc_attach', an > OpenACC 'enter data' directive with 'attach' clause currently uses > this same reference-counted code path, and thus such an 'attach' > without corresponding 'detach' inhibits unmapping; [...] The attached patch stops attach/detach operations from affecting reference counts (either structured or dynamic). This isn't as invasive as I'd imagined: we can extend the use of the "do_detach" flag in target_mem_descs' variable lists to mark mappings that correspond to attach operations, then use that flag to avoid refcount increment/decrements. (The flag should possibly be renamed now.) I've modified the refcount self-testing code successfully to work with this new scheme too, in case that's helpful. I'll send the patches for that separately. Tested with offloading to NVPTX. OK? Thanks, Julian ChangeLog libgomp/ * oacc-mem.c (goacc_enter_data_internal): Don't affect reference counts for attach mappings. (goacc_exit_data_internal): Don't affect reference counts for detach mappings. * target.c (gomp_map_vars_existing): Don't affect reference counts for attach mappings. (gomp_map_vars_internal): Set do_detach flag unconditionally to mark attach mappings. (gomp_unmap_vars_internal): Use above flag to prevent affecting reference count for attach mappings. * testsuite/libgomp.oacc-c-c++-common/attach-detach-rc-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/attach-detach-rc-2.c: Likewise. * testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90: Mark test as shouldfail. * testsuite/libgomp.oacc-fortran/deep-copy-6.f90: Adjust to fail gracefully in no-finalize mode.
commit e5fd3efda7f176f035c5ed2e3095d4a49a780864 Author: Julian Brown <jul...@codesourcery.com> Date: Thu Jun 18 05:11:08 2020 -0700 [OpenACC] Deep copy attach/detach should not affect reference counts libgomp/ * oacc-mem.c (goacc_enter_data_internal): Don't affect reference counts for attach mappings. (goacc_exit_data_internal): Don't affect reference counts for detach mappings. * target.c (gomp_map_vars_existing): Don't affect reference counts for attach mappings. (gomp_map_vars_internal): Set do_detach flag unconditionally to mark attach mappings. (gomp_unmap_vars_internal): Use above flag to prevent affecting reference count for attach mappings. * testsuite/libgomp.oacc-c-c++-common/attach-detach-rc-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/attach-detach-rc-2.c: Likewise. * testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90: Mark test as shouldfail. * testsuite/libgomp.oacc-fortran/deep-copy-6.f90: Adjust to fail gracefully in no-finalize mode. diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c index f852652c048..9bb5887fc5e 100644 --- a/libgomp/oacc-mem.c +++ b/libgomp/oacc-mem.c @@ -1095,8 +1095,11 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum, /* A standalone attach clause. */ if ((kinds[i] & 0xff) == GOMP_MAP_ATTACH) - gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, n, - (uintptr_t) h, s, NULL); + { + gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, n, + (uintptr_t) h, s, NULL); + continue; + } else if (h + s > (void *) n->host_end) { gomp_mutex_unlock (&acc_dev->lock); @@ -1131,7 +1134,9 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum, if (tgt->list[j].key == n) { for (size_t k = 0; k < groupnum; k++) - if (j + k < tgt->list_count && tgt->list[j + k].key) + if (j + k < tgt->list_count + && tgt->list[j + k].key + && !tgt->list[j + k].do_detach) { tgt->list[j + k].key->refcount++; tgt->list[j + k].key->dynamic_refcount++; @@ -1156,7 +1161,7 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum, for (size_t j = 0; j < tgt->list_count; j++) { n = tgt->list[j].key; - if (n) + if (n && !tgt->list[j].do_detach) n->dynamic_refcount++; } } @@ -1265,14 +1270,10 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum, case GOMP_MAP_POINTER: case GOMP_MAP_DELETE: case GOMP_MAP_RELEASE: - case GOMP_MAP_DETACH: - case GOMP_MAP_FORCE_DETACH: { struct splay_tree_key_s cur_node; size_t size; - if (kind == GOMP_MAP_POINTER - || kind == GOMP_MAP_DETACH - || kind == GOMP_MAP_FORCE_DETACH) + if (kind == GOMP_MAP_POINTER) size = sizeof (void *); else size = sizes[i]; @@ -1339,6 +1340,11 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum, 'GOMP_MAP_STRUCT's anymore. */ break; + case GOMP_MAP_DETACH: + case GOMP_MAP_FORCE_DETACH: + /* These are no-ops here: handled above. */ + break; + default: gomp_fatal (">>>> goacc_exit_data_internal UNHANDLED kind 0x%.2x", kind); diff --git a/libgomp/target.c b/libgomp/target.c index 3f2becdae0e..406a1e39d98 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -382,7 +382,7 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep, (void *) newn->host_start, newn->host_end - newn->host_start, cbuf); - if (oldn->refcount != REFCOUNT_INFINITY) + if (oldn->refcount != REFCOUNT_INFINITY && kind != GOMP_MAP_ATTACH) oldn->refcount++; } @@ -1092,9 +1092,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep, tgt->list[i].length = n->host_end - n->host_start; tgt->list[i].copy_from = false; tgt->list[i].always_copy_from = false; - tgt->list[i].do_detach - = (pragma_kind != GOMP_MAP_VARS_ENTER_DATA); - n->refcount++; + tgt->list[i].do_detach = true; } else { @@ -1442,7 +1440,7 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom, for (i = 0; i < tgt->list_count; i++) { splay_tree_key k = tgt->list[i].key; - if (k == NULL) + if (k == NULL || tgt->list[i].do_detach) continue; bool do_unmap = false; diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/attach-detach-rc-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/attach-detach-rc-1.c new file mode 100644 index 00000000000..bb4d95310e6 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/attach-detach-rc-1.c @@ -0,0 +1,50 @@ +/* { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } } */ + +#include <openacc.h> +#include <assert.h> + +#define N 1024 + +struct mystr { + int pad; + int *data; +}; + +int +main (int argc, char *argv[]) +{ + int arr[N]; + struct mystr s; + + s.data = arr; + + acc_copyin (&s, sizeof (s)); + acc_create (s.data, N * sizeof (int)); + + for (int i = 0; i < 20; i++) + { +#ifdef ATTACH_VIA_DIRECTIVE + #pragma acc enter data attach(s.data) + + acc_detach ((void **) &s.data); +#else + acc_attach ((void **) &s.data); + + #pragma acc exit data detach(s.data) +#endif + } + + assert (acc_is_present (arr, N * sizeof (int))); + assert (acc_is_present (&s, sizeof (s))); + + acc_delete (arr, N * sizeof (int)); + + assert (!acc_is_present (arr, N * sizeof (int))); + + acc_copyout (&s, sizeof (s)); + + assert (!acc_is_present (&s, sizeof (s))); + assert (s.data == arr); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/attach-detach-rc-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/attach-detach-rc-2.c new file mode 100644 index 00000000000..6b5371f0e48 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/attach-detach-rc-2.c @@ -0,0 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } } */ +/* { dg-additional-options "-DATTACH_VIA_DIRECTIVE" } */ + +#include "attach-detach-rc-1.c" diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90 index ed4f10e7a3f..ad8da71d7c9 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90 @@ -1,8 +1,14 @@ ! { dg-do run } +! { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } } /* Nullify the 'finalize' clause, which disturbs reference counting. */ #define finalize #include "deep-copy-6.f90" ! { dg-output ".*CheCKpOInT1(\n|\r\n|\r)" } -! { dg-output ".CheCKpOInT2(\n|\r\n|\r)" } +! { dg-output ".*CheCKpOInT2(\n|\r\n|\r)" } + +! Without the finalize, we do not detach properly so the host sees a device +! pointer, and fails with this STOP code. +! { dg-output "STOP 7(\n|\r\n|\r)+" } +! { dg-shouldfail "" } diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 index eb7d3ca160e..1a291c17241 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 @@ -12,11 +12,14 @@ program dtype end type mytype integer i - type(mytype) :: var + type(mytype), target :: var + integer, pointer :: hostptr(:) allocate(var%a(1:n)) allocate(var%b(1:n)) + hostptr => var%a + !$acc data copy(var) do i = 1, n @@ -53,6 +56,8 @@ program dtype !$acc end data + if (.not. associated(hostptr, var%a)) stop 7 + do i = 1,4 if (var%a(i) .ne. 0) stop 1 if (var%b(i) .ne. 0) stop 2