Hi Julian, Tobias! On 2020-07-15T12:28:42+0200, Thomas Schwinge <tho...@codesourcery.com> wrote: > On 2020-07-14T13:43:37+0200, I wrote: >> On 2020-06-16T15:39:44-0700, Julian Brown <jul...@codesourcery.com> wrote: >>> As mentioned in the blurb for the previous patch, an "attach" operation >>> for a Fortran pointer with an array descriptor must copy that array >>> descriptor to the target. >> >> Heh, I see -- I don't think I had read the OpenACC standard in that way, >> but I think I agree your interpretation is fine. >> >> This does not create some sort of memory leak -- everything implicitly >> allocated there will eventually be deallocated again, right?
Unanswered -- but I may now have found this problem, and also found "the reverse problem" ('finalize'); see below. >>> This patch arranges for that to be so. >> >> In response to the new OpenACC/Fortran testcase that I'd submtited in >> <87wo3co0tm.fsf@euler.schwinge.homeip.net">http://mid.mail-archive.com/87wo3co0tm.fsf@euler.schwinge.homeip.net>, >> you (Julian) correctly supposed in >> <http://mid.mail-archive.com/20200709223246.23a4d0e0@squid.athome>, that >> this patch indeed does resolve that testcase, too. That wasn't obvious >> to me. So, similar to >> 'libgomp/testsuite/libgomp.oacc-c-c++-common/pr95270-{1.2}.c', please >> include my new OpenACC/Fortran testcase (if that makes sense to you), and >> reference PR95270 in the commit log. > > My new OpenACC/Fortran testcase got again broken ('libgomp: pointer > target not mapped for attach') by Tobias' commit > 102502e32ea4e8a75d6b252ba319d09d735d9aa7 "[OpenMP, Fortran] Add > structure/derived-type element mapping", > <c5b43e02-d1d5-e7cf-c11c-6daf1e8f33c5@codesourcery.com">http://mid.mail-archive.com/c5b43e02-d1d5-e7cf-c11c-6daf1e8f33c5@codesourcery.com>. > > Similar ('libgomp: attempt to attach null pointer') for your new > 'libgomp.oacc-fortran/attach-descriptor-1.f90'. > > (Whether or not 'attach'ing 'NULL' should actually be allowed, is a > separate topic for discussion.) > > So this patch here will (obviously) need to be adapted to what Tobias > changed. I see what you pushed in commit 39dda0020801045d9a604575b2a2593c05310015 "openacc: Fix standalone attach for Fortran assumed-shape array pointers" indeed has become much smaller/simpler. :-) (But, (parts of?) Tobias' commit mentioned above (plus commit 524862db444b6544c6dc87c5f06f351100ecf50d "Fix goacc/finalize-1.f tree dump-scanning for -m32", if applicable) will then also need to be backported to releases/gcc-10 branch (once un-frozen).) > (Plus my more general questions quoted above and below.) >>> OK? >> >> Basically yes (for master and releases/gcc-10 branches), but please >> consider the following: >> >>> --- a/gcc/fortran/trans-openmp.c >>> +++ b/gcc/fortran/trans-openmp.c >>> @@ -2573,8 +2573,44 @@ gfc_trans_omp_clauses (stmtblock_t *block, >>> gfc_omp_clauses *clauses, >>> } >>> } >>> if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)) >>> - && n->u.map_op != OMP_MAP_ATTACH >>> - && n->u.map_op != OMP_MAP_DETACH) >>> + && (n->u.map_op == OMP_MAP_ATTACH >>> + || n->u.map_op == OMP_MAP_DETACH)) >>> + { >>> + tree type = TREE_TYPE (decl); >>> + tree data = gfc_conv_descriptor_data_get (decl); >>> + if (present) >>> + data = gfc_build_cond_assign_expr (block, present, >>> + data, >>> + null_pointer_node); >>> + tree ptr >>> + = fold_convert (build_pointer_type (char_type_node), >>> + data); >>> + ptr = build_fold_indirect_ref (ptr); >>> + /* Standalone attach clauses used with arrays with >>> + descriptors must copy the descriptor to the target, >>> + else they won't have anything to perform the >>> + attachment onto (see OpenACC 2.6, "2.6.3. Data >>> + Structures with Pointers"). */ >>> + OMP_CLAUSE_DECL (node) = ptr; >>> + node2 = build_omp_clause (input_location, OMP_CLAUSE_MAP); >>> + OMP_CLAUSE_SET_MAP_KIND (node2, GOMP_MAP_TO_PSET); >>> + OMP_CLAUSE_DECL (node2) = decl; >>> + OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type); >>> + node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP); >>> + if (n->u.map_op == OMP_MAP_ATTACH) >>> + { >>> + OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_ATTACH); >>> + n->u.map_op = OMP_MAP_ALLOC; >>> + } >>> + else /* OMP_MAP_DETACH. */ >>> + { >>> + OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_DETACH); >>> + n->u.map_op = OMP_MAP_RELEASE; >>> + } >>> + OMP_CLAUSE_DECL (node3) = data; >>> + OMP_CLAUSE_SIZE (node3) = size_int (0); >>> + } >> >> So this ("case A") duplicates most of the code from... >> >>> + else if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl))) >>> { >>> [...] >> >> ... this existing case here ("case B"). It's not clear to me if these >> two cases really still need to be handled separately, and a little bit >> differently (regarding 'if (present)' handling, for example), or if they >> could/should (?) be merged? Tobias, do you have an opinion? (These have been merged.) >> Do we have sufficient testsuite coverage? (For example, >> 'attach'/'detach' with 'present == false', if that makes sense, or any >> other thing that case A is doing differently from case B?) (I'm not sure we're actually testing all relevant cases.) >> Shouldn't >> this get '-fdump-tree-original' and/or '-fdump-tree-gimple' testcases, >> similar to 'gfortran.dg/goacc/finalize-1.f', so that we verify/document >> what we generate here? So I guess I had -- unconsciously? ;-) -- mentioned -fdump-tree-gimple' and 'gfortran.dg/goacc/finalize-1.f' for a reason. That displays how the 'finalize' clause is implemented (see WIP patch attached, 'gfortran.dg/goacc/attach-descriptor.f90'), and... >>> --- /dev/null >>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 >>> @@ -0,0 +1,51 @@ >>> +program att >> >> Please add 'dg-do run', and... >> >>> + use openacc >>> + implicit none >>> + type t >>> + integer :: arr1(10) >>> + integer, allocatable :: arr2(:) >>> + end type t >>> + integer :: i >>> + type(t) :: myvar >>> + integer, target :: tarr(10) >>> + integer, pointer :: myptr(:) >>> + >>> + allocate(myvar%arr2(10)) >>> + >>> + do i=1,10 >>> + myvar%arr1(i) = 0 >>> + myvar%arr2(i) = 0 >>> + tarr(i) = 0 >>> + end do >>> + >>> + call acc_copyin(myvar) >>> + call acc_copyin(myvar%arr2) >>> + call acc_copyin(tarr) >>> + >>> + myptr => tarr >>> + >>> + !$acc enter data attach(myvar%arr2, myptr) >>> + >>> + ! FIXME: This warning is emitted on the wrong line number. >>> + ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target >>> openacc_nvidia_accel_selected } 36 } >> >> ... don't forget to adjust "36" here. ;-) >> >>> + !$acc serial present(myvar%arr2) >>> + do i=1,10 >>> + myvar%arr1(i) = i >>> + myvar%arr2(i) = i >>> + end do >>> + myptr(3) = 99 >>> + !$acc end serial >>> + >>> + !$acc exit data detach(myvar%arr2, myptr) ..., if we here 'detach' with 'finalize' added, that will turn into a 'delete' (instead of 'release') of 'myptr => tarr', and thus... >>> + >>> + call acc_copyout(myvar%arr2) >>> + call acc_copyout(myvar) >>> + call acc_copyout(tarr) >>> + >>> + do i=1,10 >>> + if (myvar%arr1(i) .ne. i) stop 1 >>> + if (myvar%arr2(i) .ne. i) stop 2 >>> + end do >>> + if (tarr(3) .ne. 99) stop 3 ..., here we won't see the updated 'tarr(3) == 99', and fail. >>> + >>> +end program att Alternativly, we can show the problem with 'acc_is_present', as in my WIP patch attached, 'libgomp.oacc-fortran/attach-descriptor-1__.f90'. (But when experimenting with 'acc_is_present' and Fortran 'pointer's, beware of PR96080 "OpenACC/Fortran runtime library routines vs. Fortran 'pointer'".) What should happen in this case? Do we agree that 'exit data detach(myptr)' should *never* unmap 'myptr => tarr', but really should just unmap the 'myptr' array descriptor? We can add special handling so that for standalone 'detach', a 'finalize' doesn't turn 'release' into 'delete', but that doesn't feel like the correct solution. Also, we have a different -- bigger? -- problem: if we, for example, 'attach(myptr)' twice, that operation will include twice times incrementing the reference count of 'myptr => tarr', and that'll then conflict with a 'copyout(myptr)', as that one then sees unexpected reference counts. That's a different variant of the "[OpenACC] Deep copy attach/detach should not affect reference counts" problem? Basically (see WIP patch attached, 'libgomp.oacc-fortran/attach-descriptor-1_.f90'): call acc_copyin(tarr) ! 'rc(tarr) == 1' myptr => tarr !$acc enter data attach(myptr) ! 'rc(tarr) == 2'! (not intended by the user) !$acc enter data attach(myptr) ! 'rc(tarr) == 3'! (not intended by the user) [...] call acc_copyout(tarr) ! won't copyout, because still 'rc(tarr) = 2'! (not intended by the user) if (acc_is_present(tarr)) stop 12 ! fails Ugh. :-( Or am I confused now? Grüße Thomas ----------------- Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
>From 4fa4979da3de6d15d5a39b77fdeb6b5aadec0f10 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge <tho...@codesourcery.com> Date: Fri, 17 Jul 2020 13:09:58 +0200 Subject: [PATCH] WIP Problems with "openacc: Fix standalone attach for Fortran assumed-shape array pointers" --- .../gfortran.dg/goacc/attach-descriptor.f90 | 11 ++++++++++- .../libgomp.oacc-fortran/attach-descriptor-1.f90 | 3 +++ ...descriptor-1.f90 => attach-descriptor-1_.f90} | 8 ++++++++ ...escriptor-1.f90 => attach-descriptor-1__.f90} | 16 +++++++++++++++- 4 files changed, 36 insertions(+), 2 deletions(-) copy libgomp/testsuite/libgomp.oacc-fortran/{attach-descriptor-1.f90 => attach-descriptor-1_.f90} (79%) copy libgomp/testsuite/libgomp.oacc-fortran/{attach-descriptor-1.f90 => attach-descriptor-1__.f90} (62%) diff --git a/gcc/testsuite/gfortran.dg/goacc/attach-descriptor.f90 b/gcc/testsuite/gfortran.dg/goacc/attach-descriptor.f90 index 9ca36f770c7..454ef9cccf3 100644 --- a/gcc/testsuite/gfortran.dg/goacc/attach-descriptor.f90 +++ b/gcc/testsuite/gfortran.dg/goacc/attach-descriptor.f90 @@ -1,4 +1,4 @@ -! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-original -fdump-tree-gimple" } program att implicit none @@ -12,7 +12,16 @@ program att !$acc enter data attach(myvar%arr2, myptr) ! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(attach:myvar\\.arr2 \\\[bias: 0\\\]\\) map\\(alloc:\\*\\(c_char \\*\\) myptr\\.data \\\[len: \[^\\\]\]+\\\]\\) map\\(to:myptr \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(attach:\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) myptr\\.data \\\[bias: 0\\\]\\);$" 1 "original" } } +! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_enter_exit_data map\\(attach:myvar\\.arr2 \\\[bias: 0\\\]\\) map\\(alloc:MEM\\\[\\(c_char \\*\\)\[^\\\]\]+\\\] \\\[len: \[^\\\]\]+\\\]\\) map\\(to:myptr \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(attach:myptr\\.data \\\[bias: 0\\\]\\)$" 1 "gimple" } } !$acc exit data detach(myvar%arr2, myptr) ! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(detach:myvar\\.arr2 \\\[bias: 0\\\]\\) map\\(release:\\*\\(c_char \\*\\) myptr\\.data \\\[len: \[^\\\]\]+\\\]\\) map\\(to:myptr \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(detach:\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) myptr\\.data \\\[bias: 0\\\]\\);$" 1 "original" } } +! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_enter_exit_data map\\(detach:myvar\\.arr2 \\\[bias: 0\\\]\\) map\\(release:MEM\\\[\\(c_char \\*\\)\[^\\\]\]+\\\] \\\[len: \[^\\\]\]+\\\]\\) map\\(to:myptr \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(detach:myptr\\.data \\\[bias: 0\\\]\\)$" 1 "gimple" } } + + ! Test valid usage and processing of the finalize clause. + !$acc exit data detach(myvar%arr2, myptr) finalize +! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(detach:myvar\\.arr2 \\\[bias: 0\\\]\\) map\\(release:\\*\\(c_char \\*\\) myptr\\.data \\\[len: \[^\\\]\]+\\\]\\) map\\(to:myptr \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(detach:\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) myptr\\.data \\\[bias: 0\\\]\\) finalize;$" 1 "original" } } + !TODO See how in the 'gimple' dump, 'detach' is turned into 'force_detach', and 'release' into 'delete' -- but is the latter actually correct? (See 'libgomp.oacc-fortran/attach-descriptor-1__.f90'.) +! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_enter_exit_data map\\(force_detach:myvar\\.arr2 \\\[bias: 0\\\]\\) map\\(delete:MEM\\\[\\(c_char \\*\\)\[^\\\]\]+\\\] \\\[len: \[^\\\]\]+\\\]\\) map\\(to:myptr \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(force_detach:myptr\\.data \\\[bias: 0\\\]\\) finalize$" 1 "gimple" } } + end program att diff --git a/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 index 5d79cbc14fc..99c1e787de6 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 @@ -41,8 +41,11 @@ program att !$acc exit data detach(myvar%arr2, myptr) call acc_copyout(myvar%arr2) + if (acc_is_present(myvar%arr2)) stop 10 call acc_copyout(myvar) + if (acc_is_present(myvar)) stop 11 call acc_copyout(tarr) + if (acc_is_present(tarr)) stop 12 do i=1,10 if (myvar%arr1(i) .ne. i) stop 1 diff --git a/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1_.f90 similarity index 79% copy from libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 copy to libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1_.f90 index 5d79cbc14fc..2e2e1267660 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1_.f90 @@ -26,6 +26,7 @@ program att myptr => tarr + !$acc enter data attach(myvar%arr2, myptr) !$acc enter data attach(myvar%arr2, myptr) ! FIXME: This warning is emitted on the wrong line number. @@ -39,10 +40,17 @@ program att !$acc end serial !$acc exit data detach(myvar%arr2, myptr) + !!$acc exit data detach(myvar%arr2, myptr) call acc_copyout(myvar%arr2) + if (acc_is_present(myvar%arr2)) stop 10 call acc_copyout(myvar) + if (acc_is_present(myvar)) stop 11 call acc_copyout(tarr) + if (acc_is_present(tarr)) stop 12 ! fails + + !TODO Have to stop, have copied out device pointers. + stop do i=1,10 if (myvar%arr1(i) .ne. i) stop 1 diff --git a/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1__.f90 similarity index 62% copy from libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 copy to libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1__.f90 index 5d79cbc14fc..6786f32852b 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1__.f90 @@ -27,6 +27,7 @@ program att myptr => tarr !$acc enter data attach(myvar%arr2, myptr) + !!$acc enter data attach(myvar%arr2, myptr) ! FIXME: This warning is emitted on the wrong line number. ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } 38 } @@ -38,11 +39,24 @@ program att myptr(3) = 99 !$acc end serial - !$acc exit data detach(myvar%arr2, myptr) + !$acc exit data detach(myvar%arr2, myptr) finalize + + if (.not. acc_is_present(myvar%arr2)) stop 10 + if (.not. acc_is_present(myvar)) stop 11 + if (.not. acc_is_present(tarr)) stop 12 ! fails call acc_copyout(myvar%arr2) + if (acc_is_present(myvar%arr2)) stop 20 + if (.not. acc_is_present(myvar)) stop 21 + if (.not. acc_is_present(tarr)) stop 22 call acc_copyout(myvar) + if (acc_is_present(myvar%arr2)) stop 30 + if (acc_is_present(myvar)) stop 31 + if (.not. acc_is_present(tarr)) stop 32 call acc_copyout(tarr) + if (acc_is_present(myvar%arr2)) stop 40 + if (acc_is_present(myvar)) stop 41 + if (acc_is_present(tarr)) stop 42 do i=1,10 if (myvar%arr1(i) .ne. i) stop 1 -- 2.17.1