Richard Earnshaw wrote:
On 31/03/15 11:45, Richard Biener wrote:
On Tue, 31 Mar 2015, Richard Earnshaw wrote:
On 31/03/15 11:36, Richard Biener wrote:
On Tue, 31 Mar 2015, Richard Earnshaw wrote:
On 31/03/15 11:00, Richard Biener wrote:
On Tue, 31 Mar 2015, Richard Earnshaw wrote:
On 31/03/15 08:50, Richard Biener wrote:
On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener <rguent...@suse.de> wrote:
On March 30, 2015 6:45:34 PM GMT+02:00, Alan Lawrence <alan.lawre...@arm.com>
wrote:
-O2 was what I first used; it also occurs at -O1. -fno-tree-sra fixes
it.
The problem appears to be in laying out arguments, specifically
varargs. From
the "good" -fdump-rtl-expand:
(insn 18 17 19 2 (set (mem:SI (reg/f:SI 107 virtual-outgoing-args) [0
S4 A32])
(reg:SI 111 [ b1$16 ])) reduced.c:14 -1
(nil))
(insn 19 18 20 2 (set (reg:DF 2 r2)
(reg:DF 112 [ b1$8 ])) reduced.c:14 -1
(nil))
(insn 20 19 21 2 (set (reg:SI 1 r1)
(reg:SI 113 [ b1 ])) reduced.c:14 -1
(nil))
(insn 21 20 22 2 (set (reg:SI 0 r0)
(reg:SI 118)) reduced.c:14 -1
(nil))
(call_insn 22 21 23 2 (parallel [
(set (reg:SI 0 r0)
(call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
<function_decl 0x2ab50e856d00 __builtin_printf>) [0 __builtin_printf S4
A32])
The struct members are
reg:SI 113 => int a;
reg:DF 112 => double b;
reg:SI 111 => int c;
r0 gets the format string; r1 gets int a; r2+r3 get double b; int c is
pushed
into virtual-outgoing-args. In contrast, post-change to
build_ref_of_offset, we get:
(insn 17 16 18 2 (set (reg:SI 118)
(symbol_ref/v/f:SI ("*.LC1") [flags 0x82] <var_decl 0x2ba57fa8d750
*.LC1>)) reduced.c:14 -1
(nil))
(insn 18 17 19 2 (set (mem:SI (plus:SI (reg/f:SI 107
virtual-outgoing-args)
(const_int 8 [0x8])) [0 S4 A64])
(reg:SI 111 [ b1$16 ])) reduced.c:14 -1
(nil))
(insn 19 18 20 2 (set (mem:DF (reg/f:SI 107 virtual-outgoing-args) [0
S8 A64])
(reg:DF 112 [ b1$8 ])) reduced.c:14 -1
(nil))
(insn 20 19 21 2 (set (reg:SI 2 r2)
(reg:SI 113 [ b1 ])) reduced.c:14 -1
(nil))
(insn 21 20 22 2 (set (reg:SI 0 r0)
(reg:SI 118)) reduced.c:14 -1
(nil))
(call_insn 22 21 23 2 (parallel [
(set (reg:SI 0 r0)
(call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
<function_decl 0x2ba57f843d00 __builtin_printf>) [0 __builtin_printf S4
A32])
r0 still gets the format string, but 'int b1.a' now goes in r2, and the
double+following int are all pushed into virtual-outgoing-args. This is
because
arm_function_arg is fed a 64-bit-aligned int as type of the second
argument (the
type constructed by build_ref_for_offset); it then executes
(aapcs_layout_arg,
arm.c line ~~5914)
/* C3 - For double-word aligned arguments, round the NCRN up to the
next even number. */
ncrn = pcum->aapcs_ncrn;
if ((ncrn & 1) && arm_needs_doubleword_align (mode, type))
ncrn++;
Which changes r1 to r2. Passing -fno-tree-sra, or removing from the
testcase
"*(cls_struct_16byte *)resp = b1", causes arm_function_arg to be fed a
32-bit-aligned int instead, which works as previously.
Passing the same members of that struct in a non-vargs call, works ok -
I think
because these use the type of the declared parameters, rather than the
provided
arguments, and the former do not have the increased alignment from
build_ref_for_offset.
It doesn't make sense to use the alignment of passed values. That looks like
bs.
This means that
Int I __aligned__(8);
Is passed differently than int.
Arm_function_arg needs to be fixed.
That is,
typedef int myint __attribute__((aligned(8)));
int main()
{
myint i = 1;
int j = 2;
__builtin_printf("%d %d\n", i, j);
}
or
myint i;
int j;
myint *p = &i;
int *q = &j;
int main()
{
__builtin_printf("%d %d", *p, *q);
}
should behave the same. There isn't a printf modifier for an "aligned int"
because that sort of thing doesn't make sense. Special-casing aligned vs.
non-aligned values only makes sense for things passed by value on the stack.
And then obviously only dependent on the functuion type signature, not
on the type of the passed value.
I think the testcase is ill-formed. Just because printf doesn't have
such a modifier, doesn't mean that another variadic function might not
have a means to detect when an object in the variadic list needs to be
over-aligned. As such, the test should really be written as:
A value doesn't have "alignment". A function may have alignment
requirements on its arguments, clearly printf doesn't.
Values don't. But types do and variadic functions are special in that
they derive their types from the types of the actual parameters passed
not from the formals in the prototype. Any manipulation of the types
should be done in the front end, not in the back end.
The following seems to help the testcase (by luck I'd say?). It
makes us drop alignment information from the temporaries that
SRA creates as memory replacement.
But I find it odd that on ARM passing *((aligned_int *)p) as
vararg (only as varargs?) changes calling conventions independent
of the functions type signature.
Richard.
Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c (revision 221770)
+++ gcc/tree-sra.c (working copy)
@@ -2012,7 +2012,9 @@ create_access_replacement (struct access
DECL_CONTEXT (repl) = current_function_decl;
}
else
- repl = create_tmp_var (access->type, "SR");
+ repl = create_tmp_var (build_qualified_type
+ (TYPE_MAIN_VARIANT (access->type),
+ TYPE_QUALS (access->type)), "SR");
if (TREE_CODE (access->type) == COMPLEX_TYPE
|| TREE_CODE (access->type) == VECTOR_TYPE)
{
I guess we could do something similar (use TYPE_MAIN_VARIANT) in the
backend code - but then we'd always ignore any over-alignment
requirements (or under, for that matter).
The question is whether those are even in the scope of AACPS...
Technically, they aren't. The argument passing rules are all based on
the alignment rules for fundamental types (and derived rules for
composite types based on those fundamental types) written in the AAPCS
itself. There's no provision for over-aligning a data type, so all of
this is off-piste.
R.
FWIW, I've just tested arm-none-linux-gnueabi with:
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 48342d0..37662f4 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6020,7 +6020,7 @@ static bool
arm_needs_doubleword_align (machine_mode mode, const_tree type)
{
return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY
- || (type && TYPE_ALIGN (type) > PARM_BOUNDARY));
+ || (type && TYPE_ALIGN (TYPE_MAIN_VARIANT (type)) > PARM_BOUNDARY));
}
and found no regressions (and yes this fixes the original case). Whether this is
the right way to do it however is another question!
--Alan