On 31/03/15 11:36, Richard Biener wrote: > On Tue, 31 Mar 2015, Richard Earnshaw wrote: > >> On 31/03/15 11:00, Richard Biener wrote: >>> On Tue, 31 Mar 2015, Richard Earnshaw wrote: >>> >>>> On 31/03/15 08:50, Richard Biener wrote: >>>>> On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener <rguent...@suse.de> >>>>> wrote: >>>>>> On March 30, 2015 6:45:34 PM GMT+02:00, Alan Lawrence >>>>>> <alan.lawre...@arm.com> wrote: >>>>>>> -O2 was what I first used; it also occurs at -O1. -fno-tree-sra fixes >>>>>>> it. >>>>>>> >>>>>>> The problem appears to be in laying out arguments, specifically >>>>>>> varargs. From >>>>>>> the "good" -fdump-rtl-expand: >>>>>>> >>>>>>> (insn 18 17 19 2 (set (mem:SI (reg/f:SI 107 virtual-outgoing-args) [0 >>>>>>> S4 A32]) >>>>>>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1 >>>>>>> (nil)) >>>>>>> (insn 19 18 20 2 (set (reg:DF 2 r2) >>>>>>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1 >>>>>>> (nil)) >>>>>>> (insn 20 19 21 2 (set (reg:SI 1 r1) >>>>>>> (reg:SI 113 [ b1 ])) reduced.c:14 -1 >>>>>>> (nil)) >>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0) >>>>>>> (reg:SI 118)) reduced.c:14 -1 >>>>>>> (nil)) >>>>>>> (call_insn 22 21 23 2 (parallel [ >>>>>>> (set (reg:SI 0 r0) >>>>>>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41] >>>>>>> <function_decl 0x2ab50e856d00 __builtin_printf>) [0 __builtin_printf S4 >>>>>>> A32]) >>>>>>> >>>>>>> The struct members are >>>>>>> reg:SI 113 => int a; >>>>>>> reg:DF 112 => double b; >>>>>>> reg:SI 111 => int c; >>>>>>> >>>>>>> r0 gets the format string; r1 gets int a; r2+r3 get double b; int c is >>>>>>> pushed >>>>>>> into virtual-outgoing-args. In contrast, post-change to >>>>>>> build_ref_of_offset, we get: >>>>>>> >>>>>>> (insn 17 16 18 2 (set (reg:SI 118) >>>>>>> (symbol_ref/v/f:SI ("*.LC1") [flags 0x82] <var_decl 0x2ba57fa8d750 >>>>>>> *.LC1>)) reduced.c:14 -1 >>>>>>> (nil)) >>>>>>> (insn 18 17 19 2 (set (mem:SI (plus:SI (reg/f:SI 107 >>>>>>> virtual-outgoing-args) >>>>>>> (const_int 8 [0x8])) [0 S4 A64]) >>>>>>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1 >>>>>>> (nil)) >>>>>>> (insn 19 18 20 2 (set (mem:DF (reg/f:SI 107 virtual-outgoing-args) [0 >>>>>>> S8 A64]) >>>>>>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1 >>>>>>> (nil)) >>>>>>> (insn 20 19 21 2 (set (reg:SI 2 r2) >>>>>>> (reg:SI 113 [ b1 ])) reduced.c:14 -1 >>>>>>> (nil)) >>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0) >>>>>>> (reg:SI 118)) reduced.c:14 -1 >>>>>>> (nil)) >>>>>>> (call_insn 22 21 23 2 (parallel [ >>>>>>> (set (reg:SI 0 r0) >>>>>>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41] >>>>>>> <function_decl 0x2ba57f843d00 __builtin_printf>) [0 __builtin_printf S4 >>>>>>> A32]) >>>>>>> >>>>>>> r0 still gets the format string, but 'int b1.a' now goes in r2, and the >>>>>>> >>>>>>> double+following int are all pushed into virtual-outgoing-args. This is >>>>>>> because >>>>>>> arm_function_arg is fed a 64-bit-aligned int as type of the second >>>>>>> argument (the >>>>>>> type constructed by build_ref_for_offset); it then executes >>>>>>> (aapcs_layout_arg, >>>>>>> arm.c line ~~5914) >>>>>>> >>>>>>> /* C3 - For double-word aligned arguments, round the NCRN up to the >>>>>>> next even number. */ >>>>>>> ncrn = pcum->aapcs_ncrn; >>>>>>> if ((ncrn & 1) && arm_needs_doubleword_align (mode, type)) >>>>>>> ncrn++; >>>>>>> >>>>>>> Which changes r1 to r2. Passing -fno-tree-sra, or removing from the >>>>>>> testcase >>>>>>> "*(cls_struct_16byte *)resp = b1", causes arm_function_arg to be fed a >>>>>>> 32-bit-aligned int instead, which works as previously. >>>>>>> >>>>>>> Passing the same members of that struct in a non-vargs call, works ok - >>>>>>> I think >>>>>>> because these use the type of the declared parameters, rather than the >>>>>>> provided >>>>>>> arguments, and the former do not have the increased alignment from >>>>>>> build_ref_for_offset. >>>>>> >>>>>> It doesn't make sense to use the alignment of passed values. That looks >>>>>> like bs. >>>>>> >>>>>> This means that >>>>>> >>>>>> Int I __aligned__(8); >>>>>> >>>>>> Is passed differently than int. >>>>>> >>>>>> Arm_function_arg needs to be fixed. >>>>> >>>>> That is, >>>>> >>>>> typedef int myint __attribute__((aligned(8))); >>>>> >>>>> int main() >>>>> { >>>>> myint i = 1; >>>>> int j = 2; >>>>> __builtin_printf("%d %d\n", i, j); >>>>> } >>>>> >>>>> or >>>>> >>>>> myint i; >>>>> int j; >>>>> myint *p = &i; >>>>> int *q = &j; >>>>> >>>>> int main() >>>>> { >>>>> __builtin_printf("%d %d", *p, *q); >>>>> } >>>>> >>>>> should behave the same. There isn't a printf modifier for an "aligned >>>>> int" >>>>> because that sort of thing doesn't make sense. Special-casing aligned vs. >>>>> non-aligned values only makes sense for things passed by value on the >>>>> stack. >>>>> And then obviously only dependent on the functuion type signature, not >>>>> on the type of the passed value. >>>>> >>>> >>>> I think the testcase is ill-formed. Just because printf doesn't have >>>> such a modifier, doesn't mean that another variadic function might not >>>> have a means to detect when an object in the variadic list needs to be >>>> over-aligned. As such, the test should really be written as: >>> >>> A value doesn't have "alignment". A function may have alignment >>> requirements on its arguments, clearly printf doesn't. >>> >> >> Values don't. But types do and variadic functions are special in that >> they derive their types from the types of the actual parameters passed >> not from the formals in the prototype. Any manipulation of the types >> should be done in the front end, not in the back end. > > The following seems to help the testcase (by luck I'd say?). It > makes us drop alignment information from the temporaries that > SRA creates as memory replacement. > > But I find it odd that on ARM passing *((aligned_int *)p) as > vararg (only as varargs?) changes calling conventions independent > of the functions type signature. > > Richard. > > Index: gcc/tree-sra.c > =================================================================== > --- gcc/tree-sra.c (revision 221770) > +++ gcc/tree-sra.c (working copy) > @@ -2012,7 +2012,9 @@ create_access_replacement (struct access > DECL_CONTEXT (repl) = current_function_decl; > } > else > - repl = create_tmp_var (access->type, "SR"); > + repl = create_tmp_var (build_qualified_type > + (TYPE_MAIN_VARIANT (access->type), > + TYPE_QUALS (access->type)), "SR"); > if (TREE_CODE (access->type) == COMPLEX_TYPE > || TREE_CODE (access->type) == VECTOR_TYPE) > { >
I guess we could do something similar (use TYPE_MAIN_VARIANT) in the backend code - but then we'd always ignore any over-alignment requirements (or under, for that matter). R.