On Fri, 13 Dec 2019 16:25:25 +0100
Thomas Schwinge <tho...@codesourcery.com> wrote:

> Hi Julian!
> 
> On 2019-10-29T12:15:01+0000, Julian Brown <jul...@codesourcery.com>
> wrote:
> >  static int
> > -find_pointer (int pos, size_t mapnum, unsigned short *kinds)
> > +find_group_last (int pos, size_t mapnum, unsigned short *kinds)
> >  {
> > -  if (pos + 1 >= mapnum)
> > -    return 0;
> > +  unsigned char kind0 = kinds[pos] & 0xff;
> > +  int first_pos = pos, last_pos = pos;
> >  
> > -  unsigned char kind = kinds[pos+1] & 0xff;
> > -
> > -  if (kind == GOMP_MAP_TO_PSET)
> > -    return 3;
> > -  else if (kind == GOMP_MAP_POINTER)
> > -    return 2;
> > +  if (kind0 == GOMP_MAP_TO_PSET)
> > +    {
> > +      while (pos + 1 < mapnum && (kinds[pos + 1] & 0xff) ==
> > GOMP_MAP_POINTER)
> > +   last_pos = ++pos;
> > +      /* We expect at least one GOMP_MAP_POINTER after a
> > GOMP_MAP_TO_PSET.  */
> > +      assert (last_pos > first_pos);
> > +    }
> > +  else
> > +    {
> > +      /* GOMP_MAP_ALWAYS_POINTER can only appear directly after
> > some other
> > +    mapping.  */
> > +      if (pos + 1 < mapnum
> > +     && (kinds[pos + 1] & 0xff) == GOMP_MAP_ALWAYS_POINTER)
> > +   return pos + 1;
> > +
> > +      /* We can have one or several GOMP_MAP_POINTER mappings
> > after a to/from
> > +    (etc.) mapping.  */
> > +      while (pos + 1 < mapnum && (kinds[pos + 1] & 0xff) ==
> > GOMP_MAP_POINTER)
> > +   last_pos = ++pos;
> > +    }
> >  
> > -  return 0;
> > +  return last_pos;
> >  }  
> 
> So I ran a simple experiment where I did:
> 
>     assert (find_group_last (i, mapnum, kinds) == i + pointer);
> 
> ... where 'pointer' is the current 'find_pointer' function.  (That is,
> compare that the old and new way are doing the same things, given the
> current GCC code generation/test cases.)
> 
> This 'assert' triggers for a few test cases:
> 'libgomp.oacc-fortran/allocatable-array-1.f90',
> 'libgomp.oacc-fortran/data-2.f90', 'libgomp.oacc-fortran/data-3.f90',
> 'libgomp.oacc-fortran/data-4-2.f90',
> 'libgomp.oacc-fortran/data-4.f90', 'libgomp.oacc-fortran/data-5.f90',
> 'libgomp.oacc-fortran/if-1.f90',
> 'libgomp.oacc-fortran/optional-data-enter-exit.f90'.  (Maybe those
> are the only ones actually using that stuff?)
> 
> I looked into the first one
> ('libgomp.oacc-fortran/allocatable-array-1.f90'), and for:
> 
>     integer, parameter :: n = 40
>     integer, allocatable :: ar(:,:,:)
> 
>     allocate (ar(1:n,0:n-1,0:n-1))
>     !$acc enter data copyin (ar)
> 
> ... found:
> 
>     (gdb) print mapnum
>     $2 = 3
>     (gdb) print kinds[0]
>     $3 = 1 // GOMP_MAP_TO
>     (gdb) print kinds[1]
>     $4 = 773
>     (gdb) print kinds[1] & 0xff
>     $5 = 5 // GOMP_MAP_TO_PSET
>     (gdb) print kinds[2]
>     $6 = 772
>     (gdb) print kinds[2] & 0xff
>     $7 = 4 // GOMP_MAP_POINTER
> 
> Current behavior: 'find_pointer (0, mapnum, kinds) == 3', so all three
> get mapped as one group.
> 
> New behavior: 'find_group_last (0, mapnum, kinds) == 0', so the
> 'GOMP_MAP_TO' gets mapped alone.  Then, 'find_group_last (1, mapnum,
> kinds) == 2', so the 'GOMP_MAP_TO_PSET', 'GOMP_MAP_POINTER' get
> mapped as one group.
> 
> Is that intentional?

Yes. In a previous iteration of the refcount overhaul patch, we had the
"magic" code fragment:

> +           for (int j = 0; j < 2; j++)  
> +             gomp_map_vars_async (acc_dev, aq,
> +                                  (j == 0 || pointer == 2) ? 1 : 2,
> +                                  &hostaddrs[i + j], NULL,
> +                                  &sizes[i + j], &kinds[i + j], true,
> +                                  GOMP_MAP_VARS_OPENACC_ENTER_DATA);  

The "pointer == 3" case here will do precisely the same thing as the
current iteration of the patch: pass the GOMP_MAP_TO to one
gomp_map_vars_async call, and pass the GOMP_MAP_TO_PSET +
GOMP_MAP_POINTER as a pair in a second call.

The "pointer == 2" case (i.e. with a GOMP_MAP_TO and a
GOMP_MAP_POINTER) will also handle the mappings separately in both the
earlier patch iteration and this one.

That's different from the current behaviour, because we don't want all
three mappings to be bound together. The problematic cases of doing
so might only appear with the manual deep copy patch applied also,
though (and/or with the refcount-checking patch applied/enabled). (I
don't remember exactly which test cases this affected, but I can check.)

The GOMP_MAP_TO_PSET plus the following GOMP_MAP_POINTER mappings are
treated as a group within gomp_map_vars_internal. So I'm not sure... 

> Any then, compating that to
> 'libgomp/target.c:GOMP_target_enter_exit_data', where (aside from
> 'GOMP_MAP_STRUCT'; not relevant for us right now, yes?) everything
> always gets mapped alone:
> 
>     for (i = 0; i < mapnum; i++)
>       if ((kinds[i] & 0xff) == GOMP_MAP_STRUCT)
>         { [...] }
>       else
>         gomp_map_vars (devicep, 1, &hostaddrs[i], NULL, &sizes[i],
> &kinds[i], true, GOMP_MAP_VARS_ENTER_DATA);
> 
> Is it just an "accident" that for OpenACC we were and still are going
> to do this differently, or is there an actual reason?

...why mapping one-at-a-time is the right thing to do here. Maybe the
OpenMP version never sees GOMP_MAP_TO_PSET (or GOMP_MAP_ALWAYS_POINTER,
which has a hard-wired dependency on the previous clause)? (I can try
to check that too.)

Thanks,

Julian

Reply via email to