Hi Chung-Lin! On 2023-06-22T18:03:37+0800, Chung-Lin Tang via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > This patch adjusts the implementation of acc_map_data/acc_unmap_data API > library > routines to more fit the description in the OpenACC 2.7 specification.
Thanks! > Instead of using REFCOUNT_INFINITY, we now define a REFCOUNT_ACC_MAP_DATA > special value to mark acc_map_data-created mappings, and allow adjustment of > dynamic_refcount of such mappings by other constructs. Enforcing of an initial > value of 1 for such mappings, and only allowing acc_unmap_data to delete such > mappings, is implemented as specified. > > Actually, there is no real change (or improvement) in behavior of the API > (thus > no new tests) I've looked at the related OpenACC spec issues, and it seems > that > this part of the 2.7 spec change is mostly a clarification (see no downside in > current REFCOUNT_INFINITY based implementation either). > But this patch does make the internals more close to the spec description. ACK, thanks. > Tested without regressions using powerpc64le-linux/nvptx, okay for trunk? A few comments, should be easy to work in: > --- a/libgomp/libgomp.h > +++ b/libgomp/libgomp.h > @@ -1166,6 +1166,8 @@ struct target_mem_desc; > /* Special value for refcount - tgt_offset contains target address of the > artificial pointer to "omp declare target link" object. */ > #define REFCOUNT_LINK (REFCOUNT_SPECIAL | 1) > +/* Special value for refcount - created through acc_map_data. */ > +#define REFCOUNT_ACC_MAP_DATA (REFCOUNT_SPECIAL | 2) > > /* Special value for refcount - structure element sibling list items. > All such key refounts have REFCOUNT_STRUCTELEM bits set, with _FLAG_FIRST > --- a/libgomp/oacc-mem.c > +++ b/libgomp/oacc-mem.c > @@ -411,7 +411,8 @@ acc_map_data (void *h, void *d, size_t s) > assert (n->refcount == 1); > assert (n->dynamic_refcount == 0); > /* Special reference counting behavior. */ > - n->refcount = REFCOUNT_INFINITY; > + n->refcount = REFCOUNT_ACC_MAP_DATA; > + n->dynamic_refcount = 1; > > if (profiling_p) > { > @@ -460,7 +461,7 @@ acc_unmap_data (void *h) > the different 'REFCOUNT_INFINITY' cases, or simply separate > 'REFCOUNT_INFINITY' values per different usage ('REFCOUNT_ACC_MAP_DATA' > etc.)? */ > - else if (n->refcount != REFCOUNT_INFINITY) > + else if (n->refcount != REFCOUNT_ACC_MAP_DATA) > { > gomp_mutex_unlock (&acc_dev->lock); > gomp_fatal ("refusing to unmap block [%p,+%d] that has not been mapped" Thus remove the TODO comment before this 'else if' block? :-) We should add a comment here that we're unmapping without consideration of 'n->dynamic_refcount' (that is, 'acc_unmap_data' has implicit 'finalize' semantics -- at least per my reading of the specification; do you agree?), that is: acc_map_data([var]); // 'dynamic_refcount = 1' acc_copyin([var]); // 'dynamic_refcount++' acc_unmap_data([var]); // does un-map, despite 'dynamic_refcount == 2'? assert (!acc_is_present([var])); Do we have such a test case? If not, please add one. To complement 'goacc_exit_datum_1' (see below), we should add here: assert (n->dynamic_refcount >= 1); The subsequenct code: if (tgt->refcount == REFCOUNT_INFINITY) { gomp_mutex_unlock (&acc_dev->lock); gomp_fatal ("cannot unmap target block"); } ... is now unreachable, I think, and may thus be removed -- and any inconsistency is caught by the subsequent: /* Above, we've verified that the mapping must have been set up by 'acc_map_data'. */ assert (tgt->refcount == 1); > @@ -519,7 +520,8 @@ goacc_map_var_existing (struct gomp_device_descr > *acc_dev, void *hostaddr, > } > > assert (n->refcount != REFCOUNT_LINK); > - if (n->refcount != REFCOUNT_INFINITY) > + if (n->refcount != REFCOUNT_INFINITY > + && n->refcount != REFCOUNT_ACC_MAP_DATA) > n->refcount++; > n->dynamic_refcount++; > > @@ -683,6 +685,7 @@ goacc_exit_datum_1 (struct gomp_device_descr *acc_dev, > void *h, size_t s, > > assert (n->refcount != REFCOUNT_LINK); > if (n->refcount != REFCOUNT_INFINITY > + && n->refcount != REFCOUNT_ACC_MAP_DATA > && n->refcount < n->dynamic_refcount) > { > gomp_mutex_unlock (&acc_dev->lock); > @@ -691,15 +694,27 @@ goacc_exit_datum_1 (struct gomp_device_descr *acc_dev, > void *h, size_t s, > > if (finalize) > { > - if (n->refcount != REFCOUNT_INFINITY) > + if (n->refcount != REFCOUNT_INFINITY > + && n->refcount != REFCOUNT_ACC_MAP_DATA) > n->refcount -= n->dynamic_refcount; > - n->dynamic_refcount = 0; > + > + if (n->refcount == REFCOUNT_ACC_MAP_DATA) > + /* Mappings created by acc_map_data are returned to initial > + dynamic_refcount of 1. Can only be deleted by acc_unmap_data. */ > + n->dynamic_refcount = 1; > + else > + n->dynamic_refcount = 0; > } > else if (n->dynamic_refcount) > { > - if (n->refcount != REFCOUNT_INFINITY) > + if (n->refcount != REFCOUNT_INFINITY > + && n->refcount != REFCOUNT_ACC_MAP_DATA) > n->refcount--; > - n->dynamic_refcount--; > + > + /* When mapping is created by acc_map_data, dynamic_refcount must be > + maintained at >= 1. */ > + if (n->refcount != REFCOUNT_ACC_MAP_DATA || n->dynamic_refcount > 1) > + n->dynamic_refcount--; > } I'd find those changes more concise to understand if done the following way: restore both 'if (finalize)' and 'else if (n->dynamic_refcount)' branches to their original form (other than excluding 'n->refcount' modification for 'REFCOUNT_ACC_MAP_DATA', as you have), and instead then afterwards (that is, here), do: /* Mappings created by 'acc_map_data' can only be deleted by 'acc_unmap_data'. */ if (n->refcount == REFCOUNT_ACC_MAP_DATA && n->dynamic_refcount == 0) n->dynamic_refcount = 1; That does have the same semantics, please verify? > > if (n->refcount == 0) > --- a/libgomp/target.c > +++ b/libgomp/target.c > @@ -480,7 +480,9 @@ gomp_increment_refcount (splay_tree_key k, htab_t > *refcount_set) > > uintptr_t *refcount_ptr = &k->refcount; > > - if (REFCOUNT_STRUCTELEM_FIRST_P (k->refcount)) > + if (k->refcount == REFCOUNT_ACC_MAP_DATA) > + refcount_ptr = &k->dynamic_refcount; > + else if (REFCOUNT_STRUCTELEM_FIRST_P (k->refcount)) > refcount_ptr = &k->structelem_refcount; > else if (REFCOUNT_STRUCTELEM_P (k->refcount)) > refcount_ptr = k->structelem_refcount_ptr; > @@ -527,7 +529,9 @@ gomp_decrement_refcount (splay_tree_key k, htab_t > *refcount_set, bool delete_p, > > uintptr_t *refcount_ptr = &k->refcount; > > - if (REFCOUNT_STRUCTELEM_FIRST_P (k->refcount)) > + if (k->refcount == REFCOUNT_ACC_MAP_DATA) > + refcount_ptr = &k->dynamic_refcount; > + else if (REFCOUNT_STRUCTELEM_FIRST_P (k->refcount)) > refcount_ptr = &k->structelem_refcount; > else if (REFCOUNT_STRUCTELEM_P (k->refcount)) > refcount_ptr = k->structelem_refcount_ptr; > @@ -560,6 +564,10 @@ gomp_decrement_refcount (splay_tree_key k, htab_t > *refcount_set, bool delete_p, > else if (*refcount_ptr > 0) > *refcount_ptr -= 1; > > + /* Force back to 1 if this is an acc_map_data mapping. */ > + if (k->refcount == REFCOUNT_ACC_MAP_DATA && *refcount_ptr == 0) > + *refcount_ptr = 1; > + > end: > if (*refcount_ptr == 0) > { It's not clear to me why you need this handling -- instead of just handling 'REFCOUNT_ACC_MAP_DATA' like 'REFCOUNT_INFINITY' here, that is, early 'return'? Per my understanding, this code is for OpenACC only exercised for structured data regions, and it seems strange (unnecessary?) to adjust the 'dynamic_refcount' for these for 'acc_map_data'-mapped data? Or am I missing anything? > --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c > @@ -10,7 +10,7 @@ main (int argc, char *argv[]) > { > acc_init (acc_device_default); > acc_unmap_data ((void *) foo); > -/* { dg-output "libgomp: cannot unmap target block" } */ > +/* { dg-output "libgomp: refusing to unmap block > \\\[\[0-9a-fA-FxX\]+,\\\+64\\\] that has not been mapped by 'acc_map_data'" } > */ > return 0; > } > Overall, your changes regress the commit 3e888f94624294d2b9b34ebfee0916768e5d9c3f "Add OpenACC 'acc_map_data' variant to 'libgomp.oacc-c-c++-common/deep-copy-8.c'" that I just pushed. I think you just need to handle 'REFCOUNT_ACC_MAP_DATA' like 'REFCOUNT_INFINITY' in 'libgomp/oacc-mem.c:goacc_enter_data_internal', 'if (n && struct_p)'? Please verify. But please also to the "Minimal OpenACC variant corresponding to PR96668" code in 'libgomp/oacc-mem.c:goacc_enter_data_internal' add a safeguard that we're not running into 'REFCOUNT_ACC_MAP_DATA' there. I think that's currently not (reasonably easily) possible, given that 'acc_map_data' isn't available in OpenACC/Fortran, but it'll be available later, and then I'd rather have an 'assert' trigger there, instead of random behavior. (I'm not asking you to write a mixed OpenACC/Fortran plus C test case for that scenario -- if feasible at all.) Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955