On 9 November 2016 at 09:36, Bin.Cheng <amker.ch...@gmail.com> wrote: > On Tue, Nov 8, 2016 at 9:11 AM, Richard Biener <rguent...@suse.de> wrote: >> On Mon, 7 Nov 2016, Christophe Lyon wrote: >> >>> Hi Richard, >>> >>> >>> On 7 November 2016 at 09:01, Richard Biener <rguent...@suse.de> wrote: >>> > >>> > The following fixes an oversight when computing alignment in the >>> > vectorizer. >>> > >>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. >>> > >>> > Richard. >>> > >>> > 2016-11-07 Richard Biener <rguent...@suse.de> >>> > >>> > PR tree-optimization/78189 >>> > * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix >>> > alignment computation. >>> > >>> > * g++.dg/torture/pr78189.C: New testcase. >>> > >>> > Index: gcc/testsuite/g++.dg/torture/pr78189.C >>> > =================================================================== >>> > --- gcc/testsuite/g++.dg/torture/pr78189.C (revision 0) >>> > +++ gcc/testsuite/g++.dg/torture/pr78189.C (working copy) >>> > @@ -0,0 +1,41 @@ >>> > +/* { dg-do run } */ >>> > +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } >>> > */ >>> > + >>> > +#include <cstddef> >>> > + >>> > +struct A >>> > +{ >>> > + void * a; >>> > + void * b; >>> > +}; >>> > + >>> > +struct alignas(16) B >>> > +{ >>> > + void * pad; >>> > + void * misaligned; >>> > + void * pad2; >>> > + >>> > + A a; >>> > + >>> > + void Null(); >>> > +}; >>> > + >>> > +void B::Null() >>> > +{ >>> > + a.a = nullptr; >>> > + a.b = nullptr; >>> > +} >>> > + >>> > +void __attribute__((noinline,noclone)) >>> > +NullB(void * misalignedPtr) >>> > +{ >>> > + B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - >>> > offsetof(B, misaligned)); >>> > + b->Null(); >>> > +} >>> > + >>> > +int main() >>> > +{ >>> > + B b; >>> > + NullB(&b.misaligned); >>> > + return 0; >>> > +} >>> > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c >>> > index 9346cfe..b03cb1e 100644 >>> > --- gcc/tree-vect-data-refs.c >>> > +++ gcc/tree-vect-data-refs.c >>> > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct >>> > data_reference *dr) >>> > base = ref; >>> > while (handled_component_p (base)) >>> > base = TREE_OPERAND (base, 0); >>> > + unsigned int base_alignment; >>> > + unsigned HOST_WIDE_INT base_bitpos; >>> > + get_object_alignment_1 (base, &base_alignment, &base_bitpos); >>> > + /* As data-ref analysis strips the MEM_REF down to its base operand >>> > + to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to >>> > + adjust things to make base_alignment valid as the alignment of >>> > + DR_BASE_ADDRESS. */ >>> > if (TREE_CODE (base) == MEM_REF) >>> > - base = build2 (MEM_REF, TREE_TYPE (base), base_addr, >>> > - build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0)); >>> > - unsigned int base_alignment = get_object_alignment (base); >>> > + { >>> > + base_bitpos -= mem_ref_offset (base).to_short_addr () * >>> > BITS_PER_UNIT; >>> > + base_bitpos &= (base_alignment - 1); >>> > + } >>> > + if (base_bitpos != 0) >>> > + base_alignment = base_bitpos & -base_bitpos; >>> > + /* Also look at the alignment of the base address DR analysis >>> > + computed. */ >>> > + unsigned int base_addr_alignment = get_pointer_alignment (base_addr); >>> > + if (base_addr_alignment > base_alignment) >>> > + base_alignment = base_addr_alignment; >>> > >>> > if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype))) >>> > DR_VECT_AUX (dr)->base_element_aligned = true; >>> >>> Since you committed this patch (r241892), I'm seeing execution failures: >>> gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test >>> gcc.dg/vect/pr40074.c execution test >>> on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9 >>> --with-fpu=neon-fp16 >>> (using qemu as simulator) >> >> The difference is that we now vectorize the testcase with versioning >> for alignment (but it should never execute the vectorized variant). >> I need arm peoples help to understand what is wrong. > Hi All, > I will look at it. >
Hi, This is causing new regressions on armeb: gcc.dg/vect/vect-strided-a-u8-i2-gap.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 2 loops" 1 gcc.dg/vect/vect-strided-a-u8-i2-gap.c scan-tree-dump-times vect "vectorized 2 loops" 1 pr470074 passes now, Thanks, Christophe > Thanks, > bin >> >> At least the testcase shows there is (kind-of) a missed optimization >> that we no longer figure out versioning for alignment is useless. >> I'll look into that. >> >> Richard.