On Tue, 8 Nov 2016, Richard Biener wrote:

> On Tue, 8 Nov 2016, Richard Biener wrote:
> 
> > On Mon, 7 Nov 2016, Christophe Lyon wrote:
> > 
> > > Hi Richard,
> > > 
> > > 
> > > On 7 November 2016 at 09:01, Richard Biener <rguent...@suse.de> wrote:
> > > >
> > > > The following fixes an oversight when computing alignment in the
> > > > vectorizer.
> > > >
> > > > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> > > >
> > > > Richard.
> > > >
> > > > 2016-11-07  Richard Biener  <rguent...@suse.de>
> > > >
> > > >         PR tree-optimization/78189
> > > >         * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
> > > >         alignment computation.
> > > >
> > > >         * g++.dg/torture/pr78189.C: New testcase.
> > > >
> > > > Index: gcc/testsuite/g++.dg/torture/pr78189.C
> > > > ===================================================================
> > > > --- gcc/testsuite/g++.dg/torture/pr78189.C      (revision 0)
> > > > +++ gcc/testsuite/g++.dg/torture/pr78189.C      (working copy)
> > > > @@ -0,0 +1,41 @@
> > > > +/* { dg-do run } */
> > > > +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" 
> > > > } */
> > > > +
> > > > +#include <cstddef>
> > > > +
> > > > +struct A
> > > > +{
> > > > +  void * a;
> > > > +  void * b;
> > > > +};
> > > > +
> > > > +struct alignas(16) B
> > > > +{
> > > > +  void * pad;
> > > > +  void * misaligned;
> > > > +  void * pad2;
> > > > +
> > > > +  A a;
> > > > +
> > > > +  void Null();
> > > > +};
> > > > +
> > > > +void B::Null()
> > > > +{
> > > > +  a.a = nullptr;
> > > > +  a.b = nullptr;
> > > > +}
> > > > +
> > > > +void __attribute__((noinline,noclone))
> > > > +NullB(void * misalignedPtr)
> > > > +{
> > > > +  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) 
> > > > - offsetof(B, misaligned));
> > > > +  b->Null();
> > > > +}
> > > > +
> > > > +int main()
> > > > +{
> > > > +  B b;
> > > > +  NullB(&b.misaligned);
> > > > +  return 0;
> > > > +}
> > > > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> > > > index 9346cfe..b03cb1e 100644
> > > > --- gcc/tree-vect-data-refs.c
> > > > +++ gcc/tree-vect-data-refs.c
> > > > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct 
> > > > data_reference *dr)
> > > >    base = ref;
> > > >    while (handled_component_p (base))
> > > >      base = TREE_OPERAND (base, 0);
> > > > +  unsigned int base_alignment;
> > > > +  unsigned HOST_WIDE_INT base_bitpos;
> > > > +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
> > > > +  /* As data-ref analysis strips the MEM_REF down to its base operand
> > > > +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
> > > > +     adjust things to make base_alignment valid as the alignment of
> > > > +     DR_BASE_ADDRESS.  */
> > > >    if (TREE_CODE (base) == MEM_REF)
> > > > -    base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
> > > > -                  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 
> > > > 0));
> > > > -  unsigned int base_alignment = get_object_alignment (base);
> > > > +    {
> > > > +      base_bitpos -= mem_ref_offset (base).to_short_addr () * 
> > > > BITS_PER_UNIT;
> > > > +      base_bitpos &= (base_alignment - 1);
> > > > +    }
> > > > +  if (base_bitpos != 0)
> > > > +    base_alignment = base_bitpos & -base_bitpos;
> > > > +  /* Also look at the alignment of the base address DR analysis
> > > > +     computed.  */
> > > > +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
> > > > +  if (base_addr_alignment > base_alignment)
> > > > +    base_alignment = base_addr_alignment;
> > > >
> > > >    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
> > > >      DR_VECT_AUX (dr)->base_element_aligned = true;
> > > 
> > > Since you committed this patch (r241892), I'm seeing execution failures:
> > >   gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
> > >   gcc.dg/vect/pr40074.c execution test
> > > on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
> > > --with-fpu=neon-fp16
> > > (using qemu as simulator)
> > 
> > The difference is that we now vectorize the testcase with versioning
> > for alignment (but it should never execute the vectorized variant).
> > I need arm peoples help to understand what is wrong.
> > 
> > At least the testcase shows there is (kind-of) a missed optimization
> > that we no longer figure out versioning for alignment is useless.
> > I'll look into that.
> 
> I'm testing the following which restores previous behavior.  You
> might still want to figure out what went wrong.

This is what I have applied.

Bootstrapped / regtested on x86_64-unknown-linux-gnu.

Richard.

2016-11-09  Richard Biener  <rguent...@suse.de>

        * tree-vect-data-refs.c (vect_compute_data_ref_alignment):
        Look at the DR_BASE_ADDRESS object for forcing alignment.

Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c   (revision 241990)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -813,12 +813,9 @@ vect_compute_data_ref_alignment (struct
 
   if (base_alignment < TYPE_ALIGN (vectype))
     {
-      /* Strip an inner MEM_REF to a bare decl if possible.  */
-      if (TREE_CODE (base) == MEM_REF
-         && integer_zerop (TREE_OPERAND (base, 1))
-         && TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR)
-       base = TREE_OPERAND (TREE_OPERAND (base, 0), 0);
-
+      base = base_addr;
+      if (TREE_CODE (base) == ADDR_EXPR)
+       base = TREE_OPERAND (base, 0);
       if (!vect_can_force_dr_alignment_p (base, TYPE_ALIGN (vectype)))
        {
          if (dump_enabled_p ())

Reply via email to