Hi Chung-Lin!

On 2023-06-22T18:03:37+0800, Chung-Lin Tang via Gcc-patches 
<gcc-patches@gcc.gnu.org> wrote:
> This patch adjusts the implementation of acc_map_data/acc_unmap_data API 
> library
> routines to more fit the description in the OpenACC 2.7 specification.

Thanks!

> Instead of using REFCOUNT_INFINITY, we now define a REFCOUNT_ACC_MAP_DATA
> special value to mark acc_map_data-created mappings, and allow adjustment of
> dynamic_refcount of such mappings by other constructs. Enforcing of an initial
> value of 1 for such mappings, and only allowing acc_unmap_data to delete such
> mappings, is implemented as specified.
>
> Actually, there is no real change (or improvement) in behavior of the API 
> (thus
> no new tests) I've looked at the related OpenACC spec issues, and it seems 
> that
> this part of the 2.7 spec change is mostly a clarification (see no downside in
> current REFCOUNT_INFINITY based implementation either).
> But this patch does make the internals more close to the spec description.

ACK, thanks.

> Tested without regressions using powerpc64le-linux/nvptx, okay for trunk?

A few comments, should be easy to work in:

> --- a/libgomp/libgomp.h
> +++ b/libgomp/libgomp.h
> @@ -1166,6 +1166,8 @@ struct target_mem_desc;
>  /* Special value for refcount - tgt_offset contains target address of the
>     artificial pointer to "omp declare target link" object.  */
>  #define REFCOUNT_LINK     (REFCOUNT_SPECIAL | 1)
> +/* Special value for refcount - created through acc_map_data.  */
> +#define REFCOUNT_ACC_MAP_DATA (REFCOUNT_SPECIAL | 2)
>
>  /* Special value for refcount - structure element sibling list items.
>     All such key refounts have REFCOUNT_STRUCTELEM bits set, with _FLAG_FIRST

> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c
> @@ -411,7 +411,8 @@ acc_map_data (void *h, void *d, size_t s)
>        assert (n->refcount == 1);
>        assert (n->dynamic_refcount == 0);
>        /* Special reference counting behavior.  */
> -      n->refcount = REFCOUNT_INFINITY;
> +      n->refcount = REFCOUNT_ACC_MAP_DATA;
> +      n->dynamic_refcount = 1;
>
>        if (profiling_p)
>       {
> @@ -460,7 +461,7 @@ acc_unmap_data (void *h)
>       the different 'REFCOUNT_INFINITY' cases, or simply separate
>       'REFCOUNT_INFINITY' values per different usage ('REFCOUNT_ACC_MAP_DATA'
>       etc.)?  */
> -  else if (n->refcount != REFCOUNT_INFINITY)
> +  else if (n->refcount != REFCOUNT_ACC_MAP_DATA)
>      {
>        gomp_mutex_unlock (&acc_dev->lock);
>        gomp_fatal ("refusing to unmap block [%p,+%d] that has not been mapped"

Thus remove the TODO comment before this 'else if' block?  :-)

We should add a comment here that we're unmapping without consideration
of 'n->dynamic_refcount' (that is, 'acc_unmap_data' has implicit
'finalize' semantics -- at least per my reading of the specification; do
you agree?), that is:

    acc_map_data([var]); // 'dynamic_refcount = 1'
    acc_copyin([var]); // 'dynamic_refcount++'
    acc_unmap_data([var]); // does un-map, despite 'dynamic_refcount == 2'?
    assert (!acc_is_present([var]));

Do we have such a test case?  If not, please add one.

To complement 'goacc_exit_datum_1' (see below), we should add here:

    assert (n->dynamic_refcount >= 1);

The subsequenct code:

    if (tgt->refcount == REFCOUNT_INFINITY)
      {
        gomp_mutex_unlock (&acc_dev->lock);
        gomp_fatal ("cannot unmap target block");
      }

... is now unreachable, I think, and may thus be removed -- and any
inconsistency is caught by the subsequent:

    /* Above, we've verified that the mapping must have been set up by
       'acc_map_data'.  */
    assert (tgt->refcount == 1);

> @@ -519,7 +520,8 @@ goacc_map_var_existing (struct gomp_device_descr 
> *acc_dev, void *hostaddr,
>      }
>
>    assert (n->refcount != REFCOUNT_LINK);
> -  if (n->refcount != REFCOUNT_INFINITY)
> +  if (n->refcount != REFCOUNT_INFINITY
> +      && n->refcount != REFCOUNT_ACC_MAP_DATA)
>      n->refcount++;
>    n->dynamic_refcount++;
>
> @@ -683,6 +685,7 @@ goacc_exit_datum_1 (struct gomp_device_descr *acc_dev, 
> void *h, size_t s,
>
>    assert (n->refcount != REFCOUNT_LINK);
>    if (n->refcount != REFCOUNT_INFINITY
> +      && n->refcount != REFCOUNT_ACC_MAP_DATA
>        && n->refcount < n->dynamic_refcount)
>      {
>        gomp_mutex_unlock (&acc_dev->lock);
> @@ -691,15 +694,27 @@ goacc_exit_datum_1 (struct gomp_device_descr *acc_dev, 
> void *h, size_t s,
>
>    if (finalize)
>      {
> -      if (n->refcount != REFCOUNT_INFINITY)
> +      if (n->refcount != REFCOUNT_INFINITY
> +       && n->refcount != REFCOUNT_ACC_MAP_DATA)
>       n->refcount -= n->dynamic_refcount;
> -      n->dynamic_refcount = 0;
> +
> +      if (n->refcount == REFCOUNT_ACC_MAP_DATA)
> +     /* Mappings created by acc_map_data are returned to initial
> +        dynamic_refcount of 1. Can only be deleted by acc_unmap_data.  */
> +     n->dynamic_refcount = 1;
> +      else
> +     n->dynamic_refcount = 0;
>      }
>    else if (n->dynamic_refcount)
>      {
> -      if (n->refcount != REFCOUNT_INFINITY)
> +      if (n->refcount != REFCOUNT_INFINITY
> +       && n->refcount != REFCOUNT_ACC_MAP_DATA)
>       n->refcount--;
> -      n->dynamic_refcount--;
> +
> +      /* When mapping is created by acc_map_data, dynamic_refcount must be
> +      maintained at >= 1.  */
> +      if (n->refcount != REFCOUNT_ACC_MAP_DATA || n->dynamic_refcount > 1)
> +     n->dynamic_refcount--;
>      }

I'd find those changes more concise to understand if done the following
way: restore both 'if (finalize)' and 'else if (n->dynamic_refcount)'
branches to their original form (other than excluding 'n->refcount'
modification for 'REFCOUNT_ACC_MAP_DATA', as you have), and instead then
afterwards (that is, here), do:

    /* Mappings created by 'acc_map_data' can only be deleted by 
'acc_unmap_data'.  */
    if (n->refcount == REFCOUNT_ACC_MAP_DATA
        && n->dynamic_refcount == 0)
      n->dynamic_refcount = 1;

That does have the same semantics, please verify?

>
>    if (n->refcount == 0)

> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -480,7 +480,9 @@ gomp_increment_refcount (splay_tree_key k, htab_t 
> *refcount_set)
>
>    uintptr_t *refcount_ptr = &k->refcount;
>
> -  if (REFCOUNT_STRUCTELEM_FIRST_P (k->refcount))
> +  if (k->refcount == REFCOUNT_ACC_MAP_DATA)
> +    refcount_ptr = &k->dynamic_refcount;
> +  else if (REFCOUNT_STRUCTELEM_FIRST_P (k->refcount))
>      refcount_ptr = &k->structelem_refcount;
>    else if (REFCOUNT_STRUCTELEM_P (k->refcount))
>      refcount_ptr = k->structelem_refcount_ptr;
> @@ -527,7 +529,9 @@ gomp_decrement_refcount (splay_tree_key k, htab_t 
> *refcount_set, bool delete_p,
>
>    uintptr_t *refcount_ptr = &k->refcount;
>
> -  if (REFCOUNT_STRUCTELEM_FIRST_P (k->refcount))
> +  if (k->refcount == REFCOUNT_ACC_MAP_DATA)
> +    refcount_ptr = &k->dynamic_refcount;
> +  else if (REFCOUNT_STRUCTELEM_FIRST_P (k->refcount))
>      refcount_ptr = &k->structelem_refcount;
>    else if (REFCOUNT_STRUCTELEM_P (k->refcount))
>      refcount_ptr = k->structelem_refcount_ptr;
> @@ -560,6 +564,10 @@ gomp_decrement_refcount (splay_tree_key k, htab_t 
> *refcount_set, bool delete_p,
>    else if (*refcount_ptr > 0)
>      *refcount_ptr -= 1;
>
> +  /* Force back to 1 if this is an acc_map_data mapping.  */
> +  if (k->refcount == REFCOUNT_ACC_MAP_DATA && *refcount_ptr == 0)
> +    *refcount_ptr = 1;
> +
>   end:
>    if (*refcount_ptr == 0)
>      {

It's not clear to me why you need this handling -- instead of just
handling 'REFCOUNT_ACC_MAP_DATA' like 'REFCOUNT_INFINITY' here, that is,
early 'return'?

Per my understanding, this code is for OpenACC only exercised for
structured data regions, and it seems strange (unnecessary?) to adjust
the 'dynamic_refcount' for these for 'acc_map_data'-mapped data?  Or am I
missing anything?

> --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c
> @@ -10,7 +10,7 @@ main (int argc, char *argv[])
>  {
>    acc_init (acc_device_default);
>    acc_unmap_data ((void *) foo);
> -/* { dg-output "libgomp: cannot unmap target block" } */
> +/* { dg-output "libgomp: refusing to unmap block 
> \\\[\[0-9a-fA-FxX\]+,\\\+64\\\] that has not been mapped by 'acc_map_data'" } 
> */
>    return 0;
>  }
>

Overall, your changes regress the
commit 3e888f94624294d2b9b34ebfee0916768e5d9c3f
"Add OpenACC 'acc_map_data' variant to 
'libgomp.oacc-c-c++-common/deep-copy-8.c'"
that I just pushed.  I think you just need to handle
'REFCOUNT_ACC_MAP_DATA' like 'REFCOUNT_INFINITY' in
'libgomp/oacc-mem.c:goacc_enter_data_internal', 'if (n && struct_p)'?
Please verify.

But please also to the "Minimal OpenACC variant corresponding to PR96668"
code in 'libgomp/oacc-mem.c:goacc_enter_data_internal' add a safeguard
that we're not running into 'REFCOUNT_ACC_MAP_DATA' there.  I think
that's currently not (reasonably easily) possible, given that
'acc_map_data' isn't available in OpenACC/Fortran, but it'll be available
later, and then I'd rather have an 'assert' trigger there, instead of
random behavior.  (I'm not asking you to write a mixed OpenACC/Fortran
plus C test case for that scenario -- if feasible at all.)


Grüße
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

Reply via email to