On Wed, Nov 21, 2012 at 5:58 PM, Martin Jambor <mjam...@suse.cz> wrote: > Hi, > > On Tue, Nov 20, 2012 at 09:24:20AM -0800, Richard Henderson wrote: >> The get_pointer_alignment function can indicate that it does not know >> what the alignment should be, and it always fills in worst-case values >> for that case. We should not use these worst-case values to "optimize" >> the interface of a function. >> >> At minimum I think something like the following would be good. But >> I'm unsure why we would *ever* want to lower the alignment at a function >> interface. It seems to me that we'd simply want the caller to handle >> copying the data to an aligned location? >> >> What was the use case of this code in the first place? > > PR 52402, and not surprisingly, that testcase fails on x86_64-linux > with your patch. Furthermore, misalignment due to MEM_REF offset has > to be accounted for whether we know the alignment of base or not. > > I was hoping that we could do something along the lines of > build_ref_for_offset tree-sra.c but that would not work for cases like > the one from PR 52890, comment 7 when there is no dereference in the > caller, we tranform: > > D.2537 = &this->surfaces; > sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, > 1); > > into > > D.2537 = &this->surfaces; > D.2554 = MEM[(const struct ggTrain *)D.2537]; > sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, > 1); > > At the moment I hope I'll be able to: > > 1) Bring back tree_non_aligned_mem_p from 4.6 to be used in > access_precludes_ipa_sra_p. This way, any knowingly misaligned > load in the callee should will not be transferred to the caller, if > there is none. > > 2) In ipa_modify_call_arguments, be optimistic in like in your patch > except for the case when we are looking at a dereference (i.e. we > are turning a BLK dereference into a smaller scalar type one). In > that case, use its alignment like in build_ref_for_offset. > > But I'd like to think about this a bit more. I believe we should ask > for Richi's approval when he comes back and recovers anyway. But this > is now my highest priority (finally).
The issue here is that IPA SRA, when seeing foo (T *addr) { *addr; } foo (&mem); wanting to transform it to foo' (T value) { value; } value = *(T *)mem; foo (value); has to use the _same_ alignment for the access to mem as it was used inside foo. That's because of the fact that alignment information in GIMPLE is solely present at memory accesses and _not_ in any way associated with pointer types (as opposed to more strict rules imposed by some languages such as C, often enough violated by users, *(char *)(int *)p is an access aligned to at least 4 bytes in C). IPA SRA can use bigger alignment if it knows that is safe (thus get_pointer_alignment returns something larger than what the actual access in foo was using). What IPA SRA at the moment cannot do is "propagate" larger alignment used in foo to the call site (AFAIK). Thus, technically IPA SRA can use MAX (get_pointer_alignment (call site), get_object_alignment (original dereference site)). For fixing pessimizations caused by IPA SRA I suggest to simply punt (not do the transform) if get_pointer_alignment_1 returns false for the actual call argument. Or implement the propagation. Richard. > Thanks, > > Martin > > >> >> >> >> r~ > >> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c >> index 3150bd6..d117389 100644 >> --- a/gcc/ipa-prop.c >> +++ b/gcc/ipa-prop.c >> @@ -2956,15 +2956,17 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, >> gimple stmt, >> unsigned int align; >> unsigned HOST_WIDE_INT misalign; >> >> - get_pointer_alignment_1 (base, &align, &misalign); >> - misalign += (tree_to_double_int (off) >> - .sext (TYPE_PRECISION (TREE_TYPE (off))).low >> - * BITS_PER_UNIT); >> - misalign = misalign & (align - 1); >> - if (misalign != 0) >> - align = (misalign & -misalign); >> - if (align < TYPE_ALIGN (type)) >> - type = build_aligned_type (type, align); >> + if (get_pointer_alignment_1 (base, &align, &misalign)) >> + { >> + misalign += (tree_to_double_int (off) >> + .sext (TYPE_PRECISION (TREE_TYPE (off))).low >> + * BITS_PER_UNIT); >> + misalign = misalign & (align - 1); >> + if (misalign != 0) >> + align = (misalign & -misalign); >> + if (align < TYPE_ALIGN (type)) >> + type = build_aligned_type (type, align); >> + } >> expr = fold_build2_loc (loc, MEM_REF, type, base, off); >> } >> else >