On Tue, 4 Dec 2012, Martin Jambor wrote: > Hi, > > when working at the IPA-SRA misalignment problems I have noticed that > the traditional intraprocedural SRA can also produce this when the > MEM_REF it is after is hidden below a handled component. Fortunately, > now that get_object_alignment looks into MEM_REFS for alignment, we > can use it directly on the given base, as is done in the following > patch. When we know the alignment of the originally requested base we > also only combine it with only the requested offset, as opposed to > also including the offset returned by get_addr_base_and_unit_offset. > > The patch has passed bootstrap and testing on x86_64-linux and > powerpc64-linux, the same is currently in progress on sparc64-linux. > > OK for trunk if it passes there as well? > > Thanks, > > Martin > > > 2012-12-04 Martin Jambor <mjam...@suse.cz> > > PR tree-optimization/55590 > * tree-sra.c (build_ref_for_offset): Use get_object_alignment to > get base alignment. > > * testsuite/gcc.target/i386/pr55590-1.c: New test. > * testsuite/gcc.target/i386/pr55590-2.c: Likewise. > > *** /dev/null Thu Oct 25 13:49:30 2012 > --- gcc/testsuite/gcc.target/i386/pr55590-1.c Tue Dec 4 11:36:54 2012 > *************** > *** 0 **** > --- 1,27 ---- > + /* { dg-do compile } */ > + /* { dg-options "-O2 -mavx" } */ > + > + #include <immintrin.h> > + > + struct S > + { > + __m128 a, b; > + }; > + > + struct T > + { > + int a; > + struct S s; > + }; > + > + > + void foo (struct T *p, __m128 v) > + { > + struct S s; > + > + s = p->s; > + s.b = _mm_add_ps(s.b, v); > + p->s = s; > + } > + > + /* { dg-final { scan-assembler-not "vmovups" } } */ > *** /dev/null Thu Oct 25 13:49:30 2012 > --- gcc/testsuite/gcc.target/i386/pr55590-2.c Tue Dec 4 11:37:02 2012 > *************** > *** 0 **** > --- 1,27 ---- > + /* { dg-do compile } */ > + /* { dg-options "-O2 -mavx" } */ > + > + #include <immintrin.h> > + > + struct S > + { > + __m128 a, b; > + }; > + > + struct T > + { > + int a; > + struct S s[8]; > + }; > + > + > + void foo (struct T *p, int i, __m128 v) > + { > + struct S s; > + > + s = p->s[i]; > + s.b = _mm_add_ps(s.b, v); > + p->s[i] = s; > + } > + > + /* { dg-final { scan-assembler-not "vmovups" } } */ > *** /tmp/nagqRb_tree-sra.c Tue Dec 4 13:24:05 2012 > --- gcc/tree-sra.c Tue Dec 4 13:12:10 2012 > *************** make_fancy_name (tree expr) > *** 1423,1429 **** > EXP_TYPE at the given OFFSET. If BASE is something for which > get_addr_base_and_unit_offset returns NULL, gsi must be non-NULL and is > used > to insert new statements either before or below the current one as > specified > ! by INSERT_AFTER. This function is not capable of handling bitfields. */ > > tree > build_ref_for_offset (location_t loc, tree base, HOST_WIDE_INT offset, > --- 1423,1432 ---- > EXP_TYPE at the given OFFSET. If BASE is something for which > get_addr_base_and_unit_offset returns NULL, gsi must be non-NULL and is > used > to insert new statements either before or below the current one as > specified > ! by INSERT_AFTER. This function is not capable of handling bitfields. > ! > ! BASE must be either a declaration or a memory reference that has correct > ! alignment ifformation embeded in it (e.g. a pre-existing one in SRA). */ > > tree > build_ref_for_offset (location_t loc, tree base, HOST_WIDE_INT offset, > *************** build_ref_for_offset (location_t loc, tr > *** 1433,1440 **** > tree prev_base = base; > tree off; > HOST_WIDE_INT base_offset; > ! unsigned HOST_WIDE_INT misalign; > ! unsigned int align; > > gcc_checking_assert (offset % BITS_PER_UNIT == 0); > > --- 1436,1442 ---- > tree prev_base = base; > tree off; > HOST_WIDE_INT base_offset; > ! unsigned int misalign, align = get_object_alignment (base); > > gcc_checking_assert (offset % BITS_PER_UNIT == 0); > > *************** build_ref_for_offset (location_t loc, tr > *** 1476,1497 **** > base = build_fold_addr_expr (unshare_expr (base)); > } > > ! /* If prev_base were always an originally performed access > ! we can extract more optimistic alignment information > ! by looking at the access mode. That would constrain the > ! alignment of base + base_offset which we would need to > ! adjust according to offset. */ > ! if (!get_pointer_alignment_1 (base, &align, &misalign)) > ! { > ! gcc_assert (misalign == 0); > ! if (TREE_CODE (prev_base) == MEM_REF > ! || TREE_CODE (prev_base) == TARGET_MEM_REF) > ! align = TYPE_ALIGN (TREE_TYPE (prev_base)); > ! } > ! misalign += (tree_to_double_int (off) > ! .sext (TYPE_PRECISION (TREE_TYPE (off))).low > ! * BITS_PER_UNIT); > ! misalign = misalign & (align - 1); > if (misalign != 0) > align = (misalign & -misalign); > if (align < TYPE_ALIGN (exp_type)) > --- 1478,1484 ---- > base = build_fold_addr_expr (unshare_expr (base)); > } > > ! misalign = offset & (align - 1);
Don't you want to use get_object_alignment_1 to record whether 'base' is for example at offset 3 of a 16-byte aligned object so that if 'offset' then makes the result aligned you use that alignment instead of the conservative 1-byte alignment you'd otherwise get from get_object_alignment? That is, how do 'offset' and 'base' relate? As far as I understand align should be computed as the alignment of the access to 'base' plus offset 'offset', no? Thanks, Richard.