> The following experiment resulted from looking at making
> array_ref_low_bound and array_ref_element_size non-mutating.  Again
> I wondered why we do this strange scaling by offset/element alignment.

The idea is to expose the alignment factor to the RTL expander:

        tree tem = get_inner_reference (exp, &bitsize, &bitpos, &offset, &mode1,
                                        &unsignedp, &reversep, &volatilep, true);

[...]

            rtx offset_rtx = expand_expr (offset, NULL_RTX, VOIDmode,
                                          EXPAND_SUM);

[...]

            op0 = offset_address (op0, offset_rtx,
                                  highest_pow2_factor (offset));

With the scaling, offset is something like _69 * 4 so highest_pow2_factor can
see the factor and passes it down to offset_address:

(gdb) p debug_rtx(op0)
(mem/c:SI (plus:SI (reg/f:SI 193)
        (reg:SI 194)) [3 *s.16_63 S4 A32])

With your patch in the same situation:

(gdb) p debug_rtx(op0)
(mem/c:SI (plus:SI (reg/f:SI 139)
        (reg:SI 116 [ _33 ])) [3 *s.16_63 S4 A8])

On strict-alignment targets, this makes a big difference, e.g. SPARC:

        ld      [%i4+%i5], %i0

vs

        ldub    [%i5+%i4], %g1
        sll     %g1, 24, %g1
        add     %i5, %i4, %i5
        ldub    [%i5+1], %i0
        sll     %i0, 16, %i0
        or      %i0, %g1, %i0
        ldub    [%i5+2], %g1
        sll     %g1, 8, %g1
        or      %g1, %i0, %g1
        ldub    [%i5+3], %i0
        or      %i0, %g1, %i0

Now this is mitigated by a couple of things:

  1. the above pessimization only happens on the RHS; on the LHS, the expander
calls highest_pow2_factor_for_target instead of highest_pow2_factor and the
former takes into account the type's alignment thanks to the MAX:

/* Similar, except that the alignment requirements of TARGET are
   taken into account.  Assume it is at least as aligned as its
   type, unless it is a COMPONENT_REF in which case the layout of
   the structure gives the alignment.  */

static unsigned HOST_WIDE_INT
highest_pow2_factor_for_target (const_tree target, const_tree exp)
{
  unsigned HOST_WIDE_INT talign = target_align (target) / BITS_PER_UNIT;
  unsigned HOST_WIDE_INT factor = highest_pow2_factor (exp);

  return MAX (factor, talign);
}

  2. highest_pow2_factor can be rescued by the set_nonzero_bits machinery of
the SSA CCP pass because it calls tree_ctz.  The above example was compiled
with -O -fno-tree-ccp on SPARC; at -O, the code isn't pessimized.

> So - the following patch gets rid of that scaling.
> For a "simple" C testcase
>
> void bar (void *);
> void foo (int n)
> {
>   struct S { struct R { int b[n]; } a[2]; int k; } s;
>   s.k = 1;
>   s.a[1].b[7] = 3;
>   bar (&s);
> }

This only exposes the LHS case, here's a more complete testcase:

void bar (void *);
int foo (int n)
{
  struct S { struct R { char b[n]; } a[2]; int k; } s;
  s.k = 1;
  s.a[1].b[7] = 3;
  bar (&s);
  return s.k;
}

-- 
Eric Botcazou
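
P.S. As a rough illustration of the alignment reasoning above, here is a
standalone toy model in C.  It is not GCC code: struct toy_offset and the
toy_* functions are hypothetical names invented for this sketch, and the model
assumes an offset is just an opaque variable multiplied by a known constant
scale.  It only shows why a visible "* 4" yields a 4-byte factor (A32), why an
unscaled offset falls back to 1 byte (A8), and how taking the MAX with the
target's alignment rescues the LHS case.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of an offset expression: an opaque variable times SCALE.
   SCALE == 1 models an offset whose scaling is not visible.
   (Hypothetical type, not a GCC tree.)  */
struct toy_offset
{
  uint64_t scale;
};

/* Largest power of 2 known to divide the offset: the lowest set bit
   of the constant scale.  Models what the expander can deduce from
   the expression alone.  */
static uint64_t
toy_pow2_factor (struct toy_offset off)
{
  assert (off.scale != 0);
  return off.scale & (~off.scale + 1);
}

/* Same, but never below the alignment (in bytes) of the target object,
   mirroring the MAX in the LHS path quoted above.  */
static uint64_t
toy_pow2_factor_for_target (uint64_t talign_bytes, struct toy_offset off)
{
  uint64_t factor = toy_pow2_factor (off);
  return factor > talign_bytes ? factor : talign_bytes;
}

/* Convert a byte factor into the bit alignment shown in RTL dumps,
   assuming 8-bit units.  */
static uint64_t
toy_align_bits (uint64_t factor_bytes)
{
  return factor_bytes * 8;
}
```

With a scale of 4 the model gives a 32-bit alignment; with a scale of 1 it
gives 8 bits unless the target's own 4-byte alignment is taken into account.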