On Tue, 31 Mar 2015, Richard Earnshaw wrote: > On 31/03/15 11:36, Richard Biener wrote: > > On Tue, 31 Mar 2015, Richard Earnshaw wrote: > > > >> On 31/03/15 11:00, Richard Biener wrote: > >>> On Tue, 31 Mar 2015, Richard Earnshaw wrote: > >>> > >>>> On 31/03/15 08:50, Richard Biener wrote: > >>>>> On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener <rguent...@suse.de> > >>>>> wrote: > >>>>>> On March 30, 2015 6:45:34 PM GMT+02:00, Alan Lawrence > >>>>>> <alan.lawre...@arm.com> wrote: > >>>>>>> -O2 was what I first used; it also occurs at -O1. -fno-tree-sra fixes > >>>>>>> it. > >>>>>>> > >>>>>>> The problem appears to be in laying out arguments, specifically > >>>>>>> varargs. From > >>>>>>> the "good" -fdump-rtl-expand: > >>>>>>> > >>>>>>> (insn 18 17 19 2 (set (mem:SI (reg/f:SI 107 virtual-outgoing-args) [0 > >>>>>>> S4 A32]) > >>>>>>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1 > >>>>>>> (nil)) > >>>>>>> (insn 19 18 20 2 (set (reg:DF 2 r2) > >>>>>>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1 > >>>>>>> (nil)) > >>>>>>> (insn 20 19 21 2 (set (reg:SI 1 r1) > >>>>>>> (reg:SI 113 [ b1 ])) reduced.c:14 -1 > >>>>>>> (nil)) > >>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0) > >>>>>>> (reg:SI 118)) reduced.c:14 -1 > >>>>>>> (nil)) > >>>>>>> (call_insn 22 21 23 2 (parallel [ > >>>>>>> (set (reg:SI 0 r0) > >>>>>>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41] > >>>>>>> <function_decl 0x2ab50e856d00 __builtin_printf>) [0 __builtin_printf > >>>>>>> S4 > >>>>>>> A32]) > >>>>>>> > >>>>>>> The struct members are > >>>>>>> reg:SI 113 => int a; > >>>>>>> reg:DF 112 => double b; > >>>>>>> reg:SI 111 => int c; > >>>>>>> > >>>>>>> r0 gets the format string; r1 gets int a; r2+r3 get double b; int c is > >>>>>>> pushed > >>>>>>> into virtual-outgoing-args. In contrast, post-change to > >>>>>>> build_ref_of_offset, we get: > >>>>>>> > >>>>>>> (insn 17 16 18 2 (set (reg:SI 118) > >>>>>>> (symbol_ref/v/f:SI ("*.LC1") [flags 0x82] <var_decl 0x2ba57fa8d750 > >>>>>>> *.LC1>)) reduced.c:14 -1 > >>>>>>> (nil)) > >>>>>>> (insn 18 17 19 2 (set (mem:SI (plus:SI (reg/f:SI 107 > >>>>>>> virtual-outgoing-args) > >>>>>>> (const_int 8 [0x8])) [0 S4 A64]) > >>>>>>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1 > >>>>>>> (nil)) > >>>>>>> (insn 19 18 20 2 (set (mem:DF (reg/f:SI 107 virtual-outgoing-args) [0 > >>>>>>> S8 A64]) > >>>>>>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1 > >>>>>>> (nil)) > >>>>>>> (insn 20 19 21 2 (set (reg:SI 2 r2) > >>>>>>> (reg:SI 113 [ b1 ])) reduced.c:14 -1 > >>>>>>> (nil)) > >>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0) > >>>>>>> (reg:SI 118)) reduced.c:14 -1 > >>>>>>> (nil)) > >>>>>>> (call_insn 22 21 23 2 (parallel [ > >>>>>>> (set (reg:SI 0 r0) > >>>>>>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41] > >>>>>>> <function_decl 0x2ba57f843d00 __builtin_printf>) [0 __builtin_printf > >>>>>>> S4 > >>>>>>> A32]) > >>>>>>> > >>>>>>> r0 still gets the format string, but 'int b1.a' now goes in r2, and > >>>>>>> the > >>>>>>> > >>>>>>> double+following int are all pushed into virtual-outgoing-args. This > >>>>>>> is > >>>>>>> because > >>>>>>> arm_function_arg is fed a 64-bit-aligned int as type of the second > >>>>>>> argument (the > >>>>>>> type constructed by build_ref_for_offset); it then executes > >>>>>>> (aapcs_layout_arg, > >>>>>>> arm.c line ~~5914) > >>>>>>> > >>>>>>> /* C3 - For double-word aligned arguments, round the NCRN up to the > >>>>>>> next even number. */ > >>>>>>> ncrn = pcum->aapcs_ncrn; > >>>>>>> if ((ncrn & 1) && arm_needs_doubleword_align (mode, type)) > >>>>>>> ncrn++; > >>>>>>> > >>>>>>> Which changes r1 to r2. Passing -fno-tree-sra, or removing from the > >>>>>>> testcase > >>>>>>> "*(cls_struct_16byte *)resp = b1", causes arm_function_arg to be fed a > >>>>>>> 32-bit-aligned int instead, which works as previously. > >>>>>>> > >>>>>>> Passing the same members of that struct in a non-vargs call, works ok > >>>>>>> - > >>>>>>> I think > >>>>>>> because these use the type of the declared parameters, rather than the > >>>>>>> provided > >>>>>>> arguments, and the former do not have the increased alignment from > >>>>>>> build_ref_for_offset. > >>>>>> > >>>>>> It doesn't make sense to use the alignment of passed values. That > >>>>>> looks like bs. > >>>>>> > >>>>>> This means that > >>>>>> > >>>>>> Int I __aligned__(8); > >>>>>> > >>>>>> Is passed differently than int. > >>>>>> > >>>>>> Arm_function_arg needs to be fixed. > >>>>> > >>>>> That is, > >>>>> > >>>>> typedef int myint __attribute__((aligned(8))); > >>>>> > >>>>> int main() > >>>>> { > >>>>> myint i = 1; > >>>>> int j = 2; > >>>>> __builtin_printf("%d %d\n", i, j); > >>>>> } > >>>>> > >>>>> or > >>>>> > >>>>> myint i; > >>>>> int j; > >>>>> myint *p = &i; > >>>>> int *q = &j; > >>>>> > >>>>> int main() > >>>>> { > >>>>> __builtin_printf("%d %d", *p, *q); > >>>>> } > >>>>> > >>>>> should behave the same. There isn't a printf modifier for an "aligned > >>>>> int" > >>>>> because that sort of thing doesn't make sense. Special-casing aligned > >>>>> vs. > >>>>> non-aligned values only makes sense for things passed by value on the > >>>>> stack. > >>>>> And then obviously only dependent on the functuion type signature, not > >>>>> on the type of the passed value. > >>>>> > >>>> > >>>> I think the testcase is ill-formed. Just because printf doesn't have > >>>> such a modifier, doesn't mean that another variadic function might not > >>>> have a means to detect when an object in the variadic list needs to be > >>>> over-aligned. As such, the test should really be written as: > >>> > >>> A value doesn't have "alignment". A function may have alignment > >>> requirements on its arguments, clearly printf doesn't. > >>> > >> > >> Values don't. But types do and variadic functions are special in that > >> they derive their types from the types of the actual parameters passed > >> not from the formals in the prototype. Any manipulation of the types > >> should be done in the front end, not in the back end. > > > > The following seems to help the testcase (by luck I'd say?). It > > makes us drop alignment information from the temporaries that > > SRA creates as memory replacement. > > > > But I find it odd that on ARM passing *((aligned_int *)p) as > > vararg (only as varargs?) changes calling conventions independent > > of the functions type signature. > > > > Richard. > > > > Index: gcc/tree-sra.c > > =================================================================== > > --- gcc/tree-sra.c (revision 221770) > > +++ gcc/tree-sra.c (working copy) > > @@ -2012,7 +2012,9 @@ create_access_replacement (struct access > > DECL_CONTEXT (repl) = current_function_decl; > > } > > else > > - repl = create_tmp_var (access->type, "SR"); > > + repl = create_tmp_var (build_qualified_type > > + (TYPE_MAIN_VARIANT (access->type), > > + TYPE_QUALS (access->type)), "SR"); > > if (TREE_CODE (access->type) == COMPLEX_TYPE > > || TREE_CODE (access->type) == VECTOR_TYPE) > > { > > > > I guess we could do something similar (use TYPE_MAIN_VARIANT) in the > backend code - but then we'd always ignore any over-alignment > requirements (or under, for that matter).
The question is whether those are even in the scope of AACPS... Richard. -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)