On 11 November 2016 at 09:56, Richard Biener <rguent...@suse.de> wrote: > On Thu, 10 Nov 2016, Christophe Lyon wrote: > >> On 10 November 2016 at 09:34, Richard Biener <rguent...@suse.de> wrote: >> > On Wed, 9 Nov 2016, Christophe Lyon wrote: >> > >> >> On 9 November 2016 at 09:36, Bin.Cheng <amker.ch...@gmail.com> wrote: >> >> > On Tue, Nov 8, 2016 at 9:11 AM, Richard Biener <rguent...@suse.de> >> >> > wrote: >> >> >> On Mon, 7 Nov 2016, Christophe Lyon wrote: >> >> >> >> >> >>> Hi Richard, >> >> >>> >> >> >>> >> >> >>> On 7 November 2016 at 09:01, Richard Biener <rguent...@suse.de> wrote: >> >> >>> > >> >> >>> > The following fixes an oversight when computing alignment in the >> >> >>> > vectorizer. >> >> >>> > >> >> >>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to >> >> >>> > trunk. >> >> >>> > >> >> >>> > Richard. >> >> >>> > >> >> >>> > 2016-11-07 Richard Biener <rguent...@suse.de> >> >> >>> > >> >> >>> > PR tree-optimization/78189 >> >> >>> > * tree-vect-data-refs.c (vect_compute_data_ref_alignment): >> >> >>> > Fix >> >> >>> > alignment computation. >> >> >>> > >> >> >>> > * g++.dg/torture/pr78189.C: New testcase. >> >> >>> > >> >> >>> > Index: gcc/testsuite/g++.dg/torture/pr78189.C >> >> >>> > =================================================================== >> >> >>> > --- gcc/testsuite/g++.dg/torture/pr78189.C (revision 0) >> >> >>> > +++ gcc/testsuite/g++.dg/torture/pr78189.C (working copy) >> >> >>> > @@ -0,0 +1,41 @@ >> >> >>> > +/* { dg-do run } */ >> >> >>> > +/* { dg-additional-options "-ftree-slp-vectorize >> >> >>> > -fno-vect-cost-model" } */ >> >> >>> > + >> >> >>> > +#include <cstddef> >> >> >>> > + >> >> >>> > +struct A >> >> >>> > +{ >> >> >>> > + void * a; >> >> >>> > + void * b; >> >> >>> > +}; >> >> >>> > + >> >> >>> > +struct alignas(16) B >> >> >>> > +{ >> >> >>> > + void * pad; >> >> >>> > + void * misaligned; >> >> >>> > + void * pad2; >> >> >>> > + >> >> >>> > + A a; >> >> >>> > + >> >> >>> > + void Null(); >> >> >>> > +}; >> >> >>> > + >> >> >>> > +void B::Null() >> >> >>> > +{ >> >> >>> > + a.a = nullptr; >> >> >>> > + a.b = nullptr; >> >> >>> > +} >> >> >>> > + >> >> >>> > +void __attribute__((noinline,noclone)) >> >> >>> > +NullB(void * misalignedPtr) >> >> >>> > +{ >> >> >>> > + B* b = reinterpret_cast<B*>(reinterpret_cast<char >> >> >>> > *>(misalignedPtr) - offsetof(B, misaligned)); >> >> >>> > + b->Null(); >> >> >>> > +} >> >> >>> > + >> >> >>> > +int main() >> >> >>> > +{ >> >> >>> > + B b; >> >> >>> > + NullB(&b.misaligned); >> >> >>> > + return 0; >> >> >>> > +} >> >> >>> > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c >> >> >>> > index 9346cfe..b03cb1e 100644 >> >> >>> > --- gcc/tree-vect-data-refs.c >> >> >>> > +++ gcc/tree-vect-data-refs.c >> >> >>> > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct >> >> >>> > data_reference *dr) >> >> >>> > base = ref; >> >> >>> > while (handled_component_p (base)) >> >> >>> > base = TREE_OPERAND (base, 0); >> >> >>> > + unsigned int base_alignment; >> >> >>> > + unsigned HOST_WIDE_INT base_bitpos; >> >> >>> > + get_object_alignment_1 (base, &base_alignment, &base_bitpos); >> >> >>> > + /* As data-ref analysis strips the MEM_REF down to its base >> >> >>> > operand >> >> >>> > + to form DR_BASE_ADDRESS and adds the offset to DR_INIT we >> >> >>> > have to >> >> >>> > + adjust things to make base_alignment valid as the alignment of >> >> >>> > + DR_BASE_ADDRESS. */ >> >> >>> > if (TREE_CODE (base) == MEM_REF) >> >> >>> > - base = build2 (MEM_REF, TREE_TYPE (base), base_addr, >> >> >>> > - build_int_cst (TREE_TYPE (TREE_OPERAND (base, >> >> >>> > 1)), 0)); >> >> >>> > - unsigned int base_alignment = get_object_alignment (base); >> >> >>> > + { >> >> >>> > + base_bitpos -= mem_ref_offset (base).to_short_addr () * >> >> >>> > BITS_PER_UNIT; >> >> >>> > + base_bitpos &= (base_alignment - 1); >> >> >>> > + } >> >> >>> > + if (base_bitpos != 0) >> >> >>> > + base_alignment = base_bitpos & -base_bitpos; >> >> >>> > + /* Also look at the alignment of the base address DR analysis >> >> >>> > + computed. */ >> >> >>> > + unsigned int base_addr_alignment = get_pointer_alignment >> >> >>> > (base_addr); >> >> >>> > + if (base_addr_alignment > base_alignment) >> >> >>> > + base_alignment = base_addr_alignment; >> >> >>> > >> >> >>> > if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype))) >> >> >>> > DR_VECT_AUX (dr)->base_element_aligned = true; >> >> >>> >> >> >>> Since you committed this patch (r241892), I'm seeing execution >> >> >>> failures: >> >> >>> gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test >> >> >>> gcc.dg/vect/pr40074.c execution test >> >> >>> on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9 >> >> >>> --with-fpu=neon-fp16 >> >> >>> (using qemu as simulator) >> >> >> >> >> >> The difference is that we now vectorize the testcase with versioning >> >> >> for alignment (but it should never execute the vectorized variant). >> >> >> I need arm peoples help to understand what is wrong. >> >> > Hi All, >> >> > I will look at it. >> >> > >> >> >> >> Hi, >> >> >> >> This is causing new regressions on armeb: >> >> gcc.dg/vect/vect-strided-a-u8-i2-gap.c -flto -ffat-lto-objects >> >> scan-tree-dump-times vect "vectorized 2 loops" 1 >> >> gcc.dg/vect/vect-strided-a-u8-i2-gap.c scan-tree-dump-times vect >> >> "vectorized 2 loops" 1 >> > >> > It's actually an improvement as armeb can't do unaligned loads. >> > Before the patch we versioned both loops for alignment (they >> > can't be possibly both aligned by luck). After the patch we >> > align 'arr' and thus the first loop becomes known-aligned and >> > vectorized while the 2nd one is now known unaligned (and thus >> > not vectorized because we can't do unaligned accesses). >> > >> > I suppose the following will fix it. Can you test that on >> > some arm variants please? >> > >> > It works ok on x86_64/i?86 so please commit if it fixes the issue >> > for you. >> > >> >> Well, it works on armeb (the test now passes), but regresses on >> arm-* :( > > You have to help me here, it's not efficient time spent by me > figuring out the right dg-targets here. I can only suppose > that maybe it uses load-lanes and that's happy with unaligned > accesses but hw_misaling is still set to false. Otherwise > I can't see what's endian specific here (other than that > -eb is probably still second class citizen in vectorization > feature enablement?) > > Please open a bugreport against the testsuite in the meantime. >
I understand, and I have just opened pr78421 to keep track of this. Sorry for the delay. Christophe > Richard. > >> Christophe >> >> > Thanks, >> > Richard. >> > >> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c >> > b/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c >> > index 52fdcf6..3752600 100644 >> > --- a/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c >> > +++ b/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c >> > @@ -71,5 +71,5 @@ int main (void) >> > return 0; >> > } >> > >> > -/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { >> > target vect_strided2 } } } */ >> > - >> > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { >> > target { vect_strided2 && { ! vect_hw_misalign } } } } } */ >> > +/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { >> > target { vect_strided2 && vect_hw_misalign } } } } */ >> >> > > -- > Richard Biener <rguent...@suse.de> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB > 21284 (AG Nuernberg)