https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at suse dot de

--- Comment #25 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Kewen Lin from comment #24)
> (In reply to Richard Biener from comment #22)
> > I see the mems properly get their base adjusted:
> > 
> > (insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
> >                 (const_int 16 [0x10])) [7 MEM[(struct Vec128D.30433 *)_10]+0
> > S16 A128])
> >         (reg:V2DI 616)) -1
> >      (nil))
> > 
> > vs.
> > 
> > (insn 389 388 390 (set (reg:HI 619)
> >         (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
> >                 (const_int 16 [0x10])) [4 MEM[(struct Vec128D.30212
> > *)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
> >      (nil))
> > 
> > both are based off a fake _10.  But we get alias sets 7 and 4 used here
> > which might be a problem.
> > 
> > See update_alias_info_with_stack_vars and uses of decls_to_pointers,
> > in particular from set_mem_attributes_minus_bitpos where we preserve
> > TBAA info with the rewrite.  I'm not sure why that should be OK ...
> > (but I'm sure I must have thought of this problem back in time)
> > 
> > Does the following fix the testcase?
> > 
> > diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
> > index 84b6833225e..81c0a63eddc 100644
> > --- a/gcc/emit-rtl.cc
> > +++ b/gcc/emit-rtl.cc
> > @@ -2128,7 +2128,7 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int
> > objectp,
> >               tree *orig_base = &attrs.expr;
> >               while (handled_component_p (*orig_base))
> >                 orig_base = &TREE_OPERAND (*orig_base, 0);
> > -             tree aptrt = reference_alias_ptr_type (*orig_base);
> > +             tree aptrt = ptr_type_node;
> >               *orig_base = build2 (MEM_REF, TREE_TYPE (*orig_base), *namep,
> >                                    build_int_cst (aptrt, 0));
> >             }
> 
> Sorry, this doesn't help.
> 
> I noticed that it makes insns 384 and 389 become:
> 
> (insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
>                 (const_int 16 [0x10])) [7 MEM <struct Vec128D.30433>
> [(voidD.48 *)_10]+0 S16 A128])
>         (reg:V2DI 616)) -1
>      (nil))
> 
> (insn 389 388 390 (set (reg:HI 619)
>         (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
>                 (const_int 16 [0x10])) [4 MEM <struct Vec128D.30212>
> [(voidD.48 *)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
>      (nil))
> 
> The alias sets are not changed.

Ah, probably the alias-set is determined from the unmangled ref ...

> Aggressively hacking further with attrs.alias = 0 can make it pass.  Can we
> make a new alias set for each partition, so that all decls involved in the
> same partition alias each other?  For a particular involved decl, it would
> then alias both its previous alias sets and the new one for its own
> partition.

hmm, no - this won't work.  In fact even attrs.alias = 0 will probably
not work reliably since we can coalesce variables that escape and thus
the above will only alter accesses via the original decls but not any
accesses done via pointers.  So indeed any alias-set mangling is pointless
here.

Consider

 {
   A x;
   int * volatile p = &x;
   *p = 1;
   .. = *p;
 }
 {
   B y;
   float * volatile q = &y;
   *q = 1;
   .. = *q;
 }

If we coalesce x and y then we are not rewriting any accesses, but
the accesses obviously still need to conflict.  The indirect accesses,
however, keep their original non-conflicting alias sets, so the scheduler
would be free to move the store to *q across the load from *p (the
"trick" would be to create an incentive for it to do so, of course).

That means we'd have to constrain code motion of accesses to remain
within the declared lifetime of the objects, not only on GIMPLE but also
on RTL.  I don't see how we can do that, not even with any of the proposed
fixes to the stack-reuse issues :/  The whole point of the stack
coalescing code is to allow addresses of objects to escape, so we do not
require seeing all accesses but instead rely on markers in the IL to
constrain object lifetime.  And those markers are thrown away - and they
are also only barriers for TBAA-compatible accesses, not for the
"storage-reuse" accesses that the GIMPLE memory model (and C++ with
placement new) allows :/
