On 24/11/15 13:23, Richard Earnshaw wrote:
On 24/11/15 13:06, Jiong Wang wrote:
On 24/11/15 10:18, Richard Earnshaw wrote:
I presume you are aware of the canonicalization rules for add? That is,
for a shift-and-add operation, the shift operand must appear first:
(plus (shift (op, op)), op)
not
(plus (op, (shift (op, op))))
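For reference, a minimal sketch (not from the patch; it assumes GCC's
rtl.h macros are in scope and that base/index are DImode register rtxes)
of how the two orderings would be built:

  /* Canonical: the shifted operand is operand 0 of the PLUS.  */
  rtx shifted = gen_rtx_ASHIFT (DImode, index, GEN_INT (3));
  rtx canon = gen_rtx_PLUS (DImode, shifted, base);
  /* gen_rtx_PLUS (DImode, base, shifted) would be the non-canonical
     ordering and will not match the shift-and-add patterns.  */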
R.
It looks to me that it is not optimal to generate an invalid mem addr in
the first place, for example (mem (plus reg, (mult reg, imm))) or even the
simple (mem (plus (plus r, r), imm)). Those complex rtx inside the mem are
hidden by the permissive memory_operand predicate and only exposed during
reload by the stricter constraints, so reload has to do extra work. If we
expose those complex rtx earlier, earlier RTL passes may find more
optimization opportunities, for example combine.
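Just as a sketch of what exposing this earlier could look like (this is
not the actual patch; it assumes GCC's force_reg / plus_constant /
replace_equiv_address helpers, with r1, r2, imm and mem as placeholders),
the inner register sum could be forced into a pseudo up front so the
remaining address is a plain base + offset:

  /* Legitimize (mem (plus (plus r1, r2), imm)) by moving the register
     sum into a fresh pseudo, leaving a simple base+offset address that
     later passes such as combine can still work with.  */
  rtx base = force_reg (Pmode, gen_rtx_PLUS (Pmode, r1, r2));
  mem = replace_equiv_address (mem, plus_constant (Pmode, base, imm));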
The following simple modification fixes the ICE and generates the best
sequence for me:
- return gen_rtx_fmt_ee (PLUS, addr_mode, base, op1);
+ addr = gen_rtx_fmt_ee (PLUS, addr_mode, op1, base);
+ emit_insn (gen_rtx_SET (base, addr));
+ return base;
That wouldn't be right either if op1 could be a const_int.
Indeed, though it would be strange to me for op1 to be a const_int here,
given those early if checks. If it is, then the incoming address rtx would
be something like the following, with the two const_ints not folded:
            +
           / \
          /   \
         +     const_int
        / \
       /   \
     Reg    const_int
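(If the two const_ints really did reach this point unfolded, something
earlier is already suspicious, since the usual helper folds them; a
sketch with reg, off0 and off1 as placeholders:)

  /* plus_constant folds a constant into an existing
     (plus reg const_int), so a tree like the one above should not
     normally survive down to the legitimizer.  */
  rtx inner = plus_constant (Pmode, reg, off0);   /* (plus reg off0)          */
  rtx whole = plus_constant (Pmode, inner, off1); /* (plus reg (off0 + off1)) */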
Or we could sync with the rtx check done earlier in
aarch64_legitimate_address_hook_p, which will return false if op1 is a
const_int:
- rtx addr = gen_rtx_fmt_ee (PLUS, addr_mode, base, op1);
+ rtx addr = gen_rtx_fmt_ee (PLUS, addr_mode, op1, base);
R.
  add x1, x29, 48
  add x1, x1, x0, sxtw 3
  stlr x19, [x1]
instead of
  add x1, x29, 64
  add x0, x1, x0, sxtw 3
  sub x0, x0, #16
  stlr x19, [x0]
or
  sxtw x0, w0
  add x1, x29, 48
  add x1, x1, x0, sxtw 3
  stlr x19, [x1]