On Tue, 31 Mar 2015, Richard Earnshaw wrote: > On 31/03/15 11:45, Richard Biener wrote: > > On Tue, 31 Mar 2015, Richard Earnshaw wrote: > > > >> On 31/03/15 11:36, Richard Biener wrote: > >>> On Tue, 31 Mar 2015, Richard Earnshaw wrote: > >>> > >>>> On 31/03/15 11:00, Richard Biener wrote: > >>>>> On Tue, 31 Mar 2015, Richard Earnshaw wrote: > >>>>> > >>>>>> On 31/03/15 08:50, Richard Biener wrote: > >>>>>>> On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener <rguent...@suse.de> > >>>>>>> wrote: > >>>>>>>> On March 30, 2015 6:45:34 PM GMT+02:00, Alan Lawrence > >>>>>>>> <alan.lawre...@arm.com> wrote: > >>>>>>>>> -O2 was what I first used; it also occurs at -O1. -fno-tree-sra > >>>>>>>>> fixes > >>>>>>>>> it. > >>>>>>>>> > >>>>>>>>> The problem appears to be in laying out arguments, specifically > >>>>>>>>> varargs. From > >>>>>>>>> the "good" -fdump-rtl-expand: > >>>>>>>>> > >>>>>>>>> (insn 18 17 19 2 (set (mem:SI (reg/f:SI 107 virtual-outgoing-args) > >>>>>>>>> [0 > >>>>>>>>> S4 A32]) > >>>>>>>>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1 > >>>>>>>>> (nil)) > >>>>>>>>> (insn 19 18 20 2 (set (reg:DF 2 r2) > >>>>>>>>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1 > >>>>>>>>> (nil)) > >>>>>>>>> (insn 20 19 21 2 (set (reg:SI 1 r1) > >>>>>>>>> (reg:SI 113 [ b1 ])) reduced.c:14 -1 > >>>>>>>>> (nil)) > >>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0) > >>>>>>>>> (reg:SI 118)) reduced.c:14 -1 > >>>>>>>>> (nil)) > >>>>>>>>> (call_insn 22 21 23 2 (parallel [ > >>>>>>>>> (set (reg:SI 0 r0) > >>>>>>>>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41] > >>>>>>>>> <function_decl 0x2ab50e856d00 __builtin_printf>) [0 > >>>>>>>>> __builtin_printf S4 > >>>>>>>>> A32]) > >>>>>>>>> > >>>>>>>>> The struct members are > >>>>>>>>> reg:SI 113 => int a; > >>>>>>>>> reg:DF 112 => double b; > >>>>>>>>> reg:SI 111 => int c; > >>>>>>>>> > >>>>>>>>> r0 gets the format string; r1 gets int a; r2+r3 get double b; int c > >>>>>>>>> is > >>>>>>>>> pushed > >>>>>>>>> into virtual-outgoing-args. In contrast, post-change to > >>>>>>>>> build_ref_of_offset, we get: > >>>>>>>>> > >>>>>>>>> (insn 17 16 18 2 (set (reg:SI 118) > >>>>>>>>> (symbol_ref/v/f:SI ("*.LC1") [flags 0x82] <var_decl > >>>>>>>>> 0x2ba57fa8d750 > >>>>>>>>> *.LC1>)) reduced.c:14 -1 > >>>>>>>>> (nil)) > >>>>>>>>> (insn 18 17 19 2 (set (mem:SI (plus:SI (reg/f:SI 107 > >>>>>>>>> virtual-outgoing-args) > >>>>>>>>> (const_int 8 [0x8])) [0 S4 A64]) > >>>>>>>>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1 > >>>>>>>>> (nil)) > >>>>>>>>> (insn 19 18 20 2 (set (mem:DF (reg/f:SI 107 virtual-outgoing-args) > >>>>>>>>> [0 > >>>>>>>>> S8 A64]) > >>>>>>>>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1 > >>>>>>>>> (nil)) > >>>>>>>>> (insn 20 19 21 2 (set (reg:SI 2 r2) > >>>>>>>>> (reg:SI 113 [ b1 ])) reduced.c:14 -1 > >>>>>>>>> (nil)) > >>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0) > >>>>>>>>> (reg:SI 118)) reduced.c:14 -1 > >>>>>>>>> (nil)) > >>>>>>>>> (call_insn 22 21 23 2 (parallel [ > >>>>>>>>> (set (reg:SI 0 r0) > >>>>>>>>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41] > >>>>>>>>> <function_decl 0x2ba57f843d00 __builtin_printf>) [0 > >>>>>>>>> __builtin_printf S4 > >>>>>>>>> A32]) > >>>>>>>>> > >>>>>>>>> r0 still gets the format string, but 'int b1.a' now goes in r2, and > >>>>>>>>> the > >>>>>>>>> > >>>>>>>>> double+following int are all pushed into virtual-outgoing-args. > >>>>>>>>> This is > >>>>>>>>> because > >>>>>>>>> arm_function_arg is fed a 64-bit-aligned int as type of the second > >>>>>>>>> argument (the > >>>>>>>>> type constructed by build_ref_for_offset); it then executes > >>>>>>>>> (aapcs_layout_arg, > >>>>>>>>> arm.c line ~~5914) > >>>>>>>>> > >>>>>>>>> /* C3 - For double-word aligned arguments, round the NCRN up to > >>>>>>>>> the > >>>>>>>>> next even number. */ > >>>>>>>>> ncrn = pcum->aapcs_ncrn; > >>>>>>>>> if ((ncrn & 1) && arm_needs_doubleword_align (mode, type)) > >>>>>>>>> ncrn++; > >>>>>>>>> > >>>>>>>>> Which changes r1 to r2. Passing -fno-tree-sra, or removing from the > >>>>>>>>> testcase > >>>>>>>>> "*(cls_struct_16byte *)resp = b1", causes arm_function_arg to be > >>>>>>>>> fed a > >>>>>>>>> 32-bit-aligned int instead, which works as previously. > >>>>>>>>> > >>>>>>>>> Passing the same members of that struct in a non-vargs call, works > >>>>>>>>> ok - > >>>>>>>>> I think > >>>>>>>>> because these use the type of the declared parameters, rather than > >>>>>>>>> the > >>>>>>>>> provided > >>>>>>>>> arguments, and the former do not have the increased alignment from > >>>>>>>>> build_ref_for_offset. > >>>>>>>> > >>>>>>>> It doesn't make sense to use the alignment of passed values. That > >>>>>>>> looks like bs. > >>>>>>>> > >>>>>>>> This means that > >>>>>>>> > >>>>>>>> Int I __aligned__(8); > >>>>>>>> > >>>>>>>> Is passed differently than int. > >>>>>>>> > >>>>>>>> Arm_function_arg needs to be fixed. > >>>>>>> > >>>>>>> That is, > >>>>>>> > >>>>>>> typedef int myint __attribute__((aligned(8))); > >>>>>>> > >>>>>>> int main() > >>>>>>> { > >>>>>>> myint i = 1; > >>>>>>> int j = 2; > >>>>>>> __builtin_printf("%d %d\n", i, j); > >>>>>>> } > >>>>>>> > >>>>>>> or > >>>>>>> > >>>>>>> myint i; > >>>>>>> int j; > >>>>>>> myint *p = &i; > >>>>>>> int *q = &j; > >>>>>>> > >>>>>>> int main() > >>>>>>> { > >>>>>>> __builtin_printf("%d %d", *p, *q); > >>>>>>> } > >>>>>>> > >>>>>>> should behave the same. There isn't a printf modifier for an > >>>>>>> "aligned int" > >>>>>>> because that sort of thing doesn't make sense. Special-casing > >>>>>>> aligned vs. > >>>>>>> non-aligned values only makes sense for things passed by value on the > >>>>>>> stack. > >>>>>>> And then obviously only dependent on the functuion type signature, not > >>>>>>> on the type of the passed value. > >>>>>>> > >>>>>> > >>>>>> I think the testcase is ill-formed. Just because printf doesn't have > >>>>>> such a modifier, doesn't mean that another variadic function might not > >>>>>> have a means to detect when an object in the variadic list needs to be > >>>>>> over-aligned. As such, the test should really be written as: > >>>>> > >>>>> A value doesn't have "alignment". A function may have alignment > >>>>> requirements on its arguments, clearly printf doesn't. > >>>>> > >>>> > >>>> Values don't. But types do and variadic functions are special in that > >>>> they derive their types from the types of the actual parameters passed > >>>> not from the formals in the prototype. Any manipulation of the types > >>>> should be done in the front end, not in the back end. > >>> > >>> The following seems to help the testcase (by luck I'd say?). It > >>> makes us drop alignment information from the temporaries that > >>> SRA creates as memory replacement. > >>> > >>> But I find it odd that on ARM passing *((aligned_int *)p) as > >>> vararg (only as varargs?) changes calling conventions independent > >>> of the functions type signature. > >>> > >>> Richard. > >>> > >>> Index: gcc/tree-sra.c > >>> =================================================================== > >>> --- gcc/tree-sra.c (revision 221770) > >>> +++ gcc/tree-sra.c (working copy) > >>> @@ -2012,7 +2012,9 @@ create_access_replacement (struct access > >>> DECL_CONTEXT (repl) = current_function_decl; > >>> } > >>> else > >>> - repl = create_tmp_var (access->type, "SR"); > >>> + repl = create_tmp_var (build_qualified_type > >>> + (TYPE_MAIN_VARIANT (access->type), > >>> + TYPE_QUALS (access->type)), "SR"); > >>> if (TREE_CODE (access->type) == COMPLEX_TYPE > >>> || TREE_CODE (access->type) == VECTOR_TYPE) > >>> { > >>> > >> > >> I guess we could do something similar (use TYPE_MAIN_VARIANT) in the > >> backend code - but then we'd always ignore any over-alignment > >> requirements (or under, for that matter). > > > > The question is whether those are even in the scope of AACPS... > > > > Technically, they aren't. The argument passing rules are all based on > the alignment rules for fundamental types (and derived rules for > composite types based on those fundamental types) written in the AAPCS > itself. There's no provision for over-aligning a data type, so all of > this is off-piste.
So in arm_needs_doubleword_align you could do /* Return true if mode/type need doubleword alignment. */ static bool arm_needs_doubleword_align (machine_mode mode, const_tree type) { return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY || (type && (AGGREGATE_TYPE_P (type) || mode == BLKmode) && TYPE_ALIGN (type) > PARM_BOUNDARY)); } ? Thus always trust the 'mode' when it doesn't apply to an aggregate? Richard. -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)