On Thu, Aug 19, 2021 at 10:14 PM Thomas Schwinge
<tho...@codesourcery.com> wrote:
>
> Hi!
>
> Richard, maybe you have an opinion here, in particular about my
> "SLP vectorizer" comment below?  Please see
> <87r1f2puss.fsf@euler.schwinge.homeip.net">http://mid.mail-archive.com/87r1f2puss.fsf@euler.schwinge.homeip.net>
> for the full context.
>
> On 2021-08-16T10:21:04+0200, Jakub Jelinek <ja...@redhat.com> wrote:
> > On Mon, Aug 16, 2021 at 10:08:42AM +0200, Thomas Schwinge wrote:
> >>  /* Build COMPONENT_REF and set TREE_THIS_VOLATILE and TREE_READONLY on it
> >>     as appropriate.  */
> >>
> >>  tree
> >>  omp_build_component_ref (tree obj, tree field)
> >>  {
> >> +  tree field_type = TREE_TYPE (field);
> >> +  tree obj_type = TREE_TYPE (obj);
> >> +  if (!ADDR_SPACE_GENERIC_P (TYPE_ADDR_SPACE (obj_type)))
> >> +    field_type
> >> +      = build_qualified_type (field_type,
> >> +                          KEEP_QUAL_ADDR_SPACE (TYPE_QUALS (obj_type)));
>
> (For later reference: "Kwok's new code" here is to propagate to
> 'field_type' any non-generic address space of 'obj_type'.)
>
> |> Concerning the current 'gcc/omp-low.c:omp_build_component_ref', for the
> |> current set of offloading testcases, we never see a
> |> '!ADDR_SPACE_GENERIC_P' there, so the address space handling doesn't seem
> |> to be necessary there (but also won't do any harm: no-op).
> >
> > Are you sure this can't trigger?
> > Say
> > extern int __seg_fs a;
> >
> > void
> > foo (void)
> > {
> >   #pragma omp parallel private (a)
> >   a = 2;
> > }
>
> That test case doesn't run into 'omp_build_component_ref' at all,
> but I'm attaching an altered and extended variant that does,
> "Add 'libgomp.c/address-space-1.c'".  OK to push to master branch?
>
> In this case, 'omp_build_component_ref' called via host compilation
> 'pass_lower_omp', it's the 'field_type' that has 'address-space-1', not
> 'obj_type', so indeed Kwok's new code is a no-op:
>
>     (gdb) call debug_tree(field_type)
>      <pointer_type 0x7ffff7686b28
>         type <integer_type 0x7ffff7686498 int address-space-1 SI
>             size <integer_cst 0x7ffff7540f30 constant 32>
>             unit-size <integer_cst 0x7ffff7540f48 constant 4>
>             align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x7ffff7686498 precision:32 min <integer_cst 0x7ffff7540ee8 -2147483648> max 
> <integer_cst 0x7ffff7540f00 2147483647>
>             pointer_to_this <pointer_type 0x7ffff7686b28>>
>         unsigned DI
>         size <integer_cst 0x7ffff7540cf0 type <integer_type 0x7ffff75590a8 
> bitsizetype> constant 64>
>         unit-size <integer_cst 0x7ffff7540d08 type <integer_type 
> 0x7ffff7559000 sizetype> constant 8>
>         align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x7ffff7686b28>
>
>     (gdb) call debug_tree(obj_type)
>      <record_type 0x7ffff7686bd0 .omp_data_t.0 readonly DI
>         size <integer_cst 0x7ffff7540cf0 type <integer_type 0x7ffff75590a8 
> bitsizetype> constant 64>
>         unit-size <integer_cst 0x7ffff7540d08 type <integer_type 
> 0x7ffff7559000 sizetype> constant 8>
>         align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x7ffff7686bd0
>         fields <field_decl 0x7ffff7568428 a
>             type <pointer_type 0x7ffff7686b28 type <integer_type 
> 0x7ffff7686498 int address-space-1>
>                 unsigned DI size <integer_cst 0x7ffff7540cf0 64> unit-size 
> <integer_cst 0x7ffff7540d08 8>
>                 align:64 warn_if_not_align:0 symtab:0 alias-set -1 
> canonical-type 0x7ffff7686b28>
>             unsigned DI /home/thomas/shared/gcc/omp/as.c:4:14 size 
> <integer_cst 0x7ffff7540cf0 64> unit-size <integer_cst 0x7ffff7540d08 8>
>             align:64 warn_if_not_align:0 offset_align 128
>             offset <integer_cst 0x7ffff7540d20 constant 0>
>             bit-offset <integer_cst 0x7ffff7540d68 constant 0> context 
> <record_type 0x7ffff7686540 .omp_data_t.0>> reference_to_this <reference_type 
> 0x7ffff7686c78>>
>
> The case that Kwok's new code handles, however, is when 'obj_type' has a
> non-generic address space, and then propagates that one to 'field_type'.
>
> For a similar OpenACC example, 'omp_build_component_ref' called via GCN
> offloading compilation 'pass_omp_oacc_neuter_broadcast', we've got
> without Kwok's new code:
>
>     (gdb) call debug_tree(field_type)
>      <boolean_type 0x7ffff7550b28 bool public unsigned QI
>         size <integer_cst 0x7ffff754fa80 type <integer_type 0x7ffff75500a8 
> bitsizetype> constant 8>
>         unit-size <integer_cst 0x7ffff754fa98 type <integer_type 
> 0x7ffff7550000 sizetype> constant 1>
>         align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x7ffff7550b28 precision:1 min <integer_cst 0x7ffff754fcd8 0> max 
> <integer_cst 0x7ffff754fd08 1>>
>
>     (gdb) call debug_tree(obj_type)
>      <record_type 0x7ffff7631000 .oacc_ws_data_s.0 address-space-4 QI
>         size <integer_cst 0x7ffff754fa80 type <integer_type 0x7ffff75500a8 
> bitsizetype> constant 8>
>         unit-size <integer_cst 0x7ffff754fa98 type <integer_type 
> 0x7ffff7550000 sizetype> constant 1>
>         align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x7ffff7631000
>         fields <field_decl 0x7ffff762e260 _52
>             type <boolean_type 0x7ffff7550b28 bool public unsigned QI size 
> <integer_cst 0x7ffff754fa80 8> unit-size <integer_cst 0x7ffff754fa98 1>
>                 align:8 warn_if_not_align:0 symtab:0 alias-set -1 
> canonical-type 0x7ffff7550b28 precision:1 min <integer_cst 0x7ffff754fcd8 0> 
> max <integer_cst 0x7ffff754fd08 1>>
>             unsigned QI <built-in>:0:0 size <integer_cst 0x7ffff754fa80 8> 
> unit-size <integer_cst 0x7ffff754fa98 1>
>             align:8 warn_if_not_align:0 offset_align 64
>             offset <integer_cst 0x7ffff754f9c0 constant 0>
>             bit-offset <integer_cst 0x7ffff754fa08 constant 0> context 
> <record_type 0x7ffff7631000 .oacc_ws_data_s.0 address-space-4>>
>         pointer_to_this <pointer_type 0x7ffff7631498>>
>
> ..., and with Kwok's new code the 'address-space-4' of 'obj_type' is
> propagated to 'field_type':
>
>     (gdb) call debug_tree(field_type)
>      <boolean_type 0x7ffff7631540 bool address-space-4 unsigned QI
>         size <integer_cst 0x7ffff754fa80 type <integer_type 0x7ffff75500a8 
> bitsizetype> constant 8>
>         unit-size <integer_cst 0x7ffff754fa98 type <integer_type 
> 0x7ffff7550000 sizetype> constant 1>
>         align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x7ffff7631540 precision:1 min <integer_cst 0x7ffff754fcd8 0> max 
> <integer_cst 0x7ffff754fd08 1>>
>
> I'm not familiar enough with these bits to tell whether Kwok's new code
> is the right solution to this problem -- or if, for example, the problem
> is rather in the SLP vectorizer, where the ICE seems to ultimately
> emerge?
>
> Without (ICEs later) vs. with (works) Kwok's new code, we see the
> 'a.xamdgcn-amdhsa.mkoffload.175t.slp1' dump change as follows (word-diff,
> only additional '<address-space-4>', occasionally):
>
>     [...]
>       {+<address-space-4>+} vector(2) long int * vectp.58;
>       {+<address-space-4>+} vector(2) long int * vectp_.oacc_worker_o.57;
>       {+<address-space-4>+} vector(2) int * vectp.56;
>       {+<address-space-4>+} vector(2) int * vectp_.oacc_worker_o.55;
>     [...]
>       {+<address-space-4>+} long int * _104;
>     [...]
>       {+<address-space-4>+} long int * _108;
>     [...]
>       <address-space-4> void * _350;
>     [...]
>       _350 = __builtin_gcn_single_copy_start (&.oacc_worker_o.6);
>     [...]
>       MEM <{+<address-space-4>+} vector(2) long int> [(long int 
> *)&.oacc_worker_o.6] = _101;
>       _108 = &.oacc_worker_o.6._22 + 16;
>       MEM <{+<address-space-4>+} vector(2) long int> [(long int *)_108] = 
> _100;
>       _104 = &.oacc_worker_o.6._22 + 32;
>     [...]
>
> For example, with Kwok's new code, '_108' ('<address-space-4> long int *')
> is cast into '(long int *)' -- presumably synthesized in the SLP
> vectorizer?  Is that correct or shouldn't that cast also include
> '<address-space-4>'?
>
> I see a similar issue has been fixed a while ago: r245772 (Git commit
> c7d97b2846c5647a81548caa3264d77c0a595010) for PR79723
> "Another case of dropped gs: prefix", changing
> 'gcc/tree-vect-stmts.c:get_vectype_for_scalar_type_and_size' as follows:
>
>     +  /* Re-attach the address-space qualifier if we canonicalized the scalar
>     +     type.  */
>     +  if (TYPE_ADDR_SPACE (orig_scalar_type) != TYPE_ADDR_SPACE (vectype))
>     +    return build_qualified_type
>     +            (vectype, KEEP_QUAL_ADDR_SPACE (TYPE_QUALS 
> (orig_scalar_type)));
>     +
>        return vectype;
>
> (It looks a bit like the address space handling is quite fragile in GCC's
> 'tree' types/interfaces?  Do we have ideas about how to make that more
> robust, less "bolt-on"?)

If in doubt always look at what RTL expansion does - it looks like
set_mem_attributes expects the address-space qualifier to be
present on the type or in case it is passed an object, on the
type of the base, or in case of a dereference, on the pointed-to
type of the pointer (and yes, that does look somewhat fragile).

So it looks like the patch you refer to shouldn't fix anything and

>     +  /* Re-attach the address-space qualifier if we canonicalized the scalar
>     +     type.  */
>     +  if (TYPE_ADDR_SPACE (orig_scalar_type) != TYPE_ADDR_SPACE (vectype))
>     +    return build_qualified_type
>     +            (vectype, KEEP_QUAL_ADDR_SPACE (TYPE_QUALS 
> (orig_scalar_type)));

looks incomplete.  What you'd need to look for is MEM_REFs built
by the vectorizer and the address-space information on the pointers,
like generated from vect_create_data_ref_ptr.  It might also be that
data-ref analysis / SCEV looks through address-space qualifier changing
casts and thus we pick up the wrong address-space in the end.

What's the testcase that ICEs on trunk?

> I did add a few 'assert's for non-generic address space to
> 'gcc/tree-vect*', but have not yet located where things may be going
> wrong.
>
>
> > I think keeping the qual addr space here is the wrong thing to do,
> > it should keep the other quals and clear the address space instead,
> > the whole struct is going to be in generic addres space, isn't it?
>
> Correct for 'omp_build_component_ref' called via host compilation
> 'pass_lower_omp', but in the case of 'omp_build_component_ref' called via
> GCN offloading compilation 'pass_omp_oacc_neuter_broadcast', 'obj_type'
> has a non-generic address space.
>
> However, regarding the former comment -- shouldn't we force generic
> address space for all 'tree' types read in via LTO streaming for
> offloading compilation?  I assume that (in the general case) address
> spaces are never compatible between host and offloading compilation?
> For the attached "Add 'libgomp.c/address-space-1.c'", propagating the
> '__seg_fs' address space across the offloading boundary (assuming I did
> interpret the dumps correctly) doesn't seem to cause any problems, but
> maybe it's problematic for other cases?  (This is, however, a separate
> issue from what I'm discussing here.)
>
>
> >> +  tree ret = build3 (COMPONENT_REF, field_type, obj, field, NULL);
> >> +  if (TREE_THIS_VOLATILE (field))
> >> +    TREE_THIS_VOLATILE (ret) |= 1;
> >> +  if (TREE_READONLY (field))
> >> +    TREE_READONLY (ret) |= 1;
> >
> > When touching these two, shouldn't it be better written as
> > = 1; instead of |= 1; ?  For a bitfield...
>
> Yes, that was just copied from the original
> 'gcc/omp-general.c:omp_build_component_ref' -- but happy to simplify
> that, of course.
>
>
> Grüße
>  Thomas
>
>
> -----------------
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955

Reply via email to