On 24/11/15 14:36, Jiong Wang wrote:
>
>
> On 24/11/15 13:23, Richard Earnshaw wrote:
>> On 24/11/15 13:06, Jiong Wang wrote:
>>>
>>> On 24/11/15 10:18, Richard Earnshaw wrote:
>>>> I presume you are aware of the canonicalization rules for add?  That
>>>> is, for a shift-and-add operation, the shift operand must appear
>>>> first.  Ie.
>>>>
>>>>   (plus (shift (op, op)), op)
>>>>
>>>> not
>>>>
>>>>   (plus (op, (shift (op, op)))
>>>>
>>>> R.
>>>
>>> It looks to me that it's not optimal to generate an invalid mem addr
>>> in the first place, for example (mem (plus reg, (mult reg, imm))) or
>>> even the simple (mem (plus (plus r, r), imm)).  Those complex rtx
>>> inside are hidden by the permissive memory_operand predicate and only
>>> exposed during reload by the stricter constraints, so reload needs to
>>> do extra work.  If we expose those complex rtx earlier, then some
>>> earlier RTL pass may find more optimization opportunities, for
>>> example combine.
>>>
>>> The following simple modification fixes the ICE and generates the
>>> best sequence to me:
>>>
>>> -  return gen_rtx_fmt_ee (PLUS, addr_mode, base, op1);
>>> +  addr = gen_rtx_fmt_ee (PLUS, addr_mode, op1, base);
>>> +  emit_insn (gen_rtx_SET (base, addr));
>>> +  return base;
>>>
>> That wouldn't be right either if op1 could be a const_int.
>
> Indeed, though it would be strange to me if op1 were a const_int here,
> given those early if checks.  If it were, the incoming address rtx
> would be something like the following, with the two const_ints not
> folded:
>
>          +
>         / \
>        /   \
>       +   const_int
>      / \
>     /   \
>   Reg  const_int
>
> or we could sync with the rtx checked in the early
> aarch64_legitimate_address_hook_p; it will return false if op1 is a
> const_int.
>
> -  rtx addr = gen_rtx_fmt_ee (PLUS, addr_mode, base, op1);
> +  rtx addr = gen_rtx_fmt_ee (PLUS, addr_mode, op1, base);
>
The safest thing is to insert a call to swap_commutative_operands_p and
then switch the order over if that returns true.

R.

>>
>> R.
>>
>>>   67 add     x1, x29, 48
>>>   68 add     x1, x1, x0, sxtw 3
>>>   69 stlr    x19, [x1]
>>>
>>> instead of
>>>
>>>   67 add     x1, x29, 64
>>>   68 add     x0, x1, x0, sxtw 3
>>>   69 sub     x0, x0, #16
>>>   70 stlr    x19, [x0]
>>>
>>> or
>>>
>>>   67 sxtw    x0, w0
>>>   68 add     x1, x29, 48
>>>   69 add     x1, x1, x0, sxtw 3
>>>   70 stlr    x19, [x1]
>>>
>>>
>>>
>>>
>>>
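For reference, Richard's suggestion might be sketched roughly as follows. This is a hypothetical, untested fragment against GCC-internal APIs (gen_rtx_fmt_ee, swap_commutative_operands_p), not a patch from the thread; variable names follow the hunks quoted above:

```c
/* Hypothetical sketch, not a tested patch: instead of hard-coding
   either (base, op1) or (op1, base), let swap_commutative_operands_p
   pick the canonical order.  It returns true when its first operand
   should come second, so after the conditional swap the
   higher-precedence operand (e.g. a mult/shift) ends up first, and a
   const_int op1 naturally stays second.  */
if (swap_commutative_operands_p (base, op1))
  std::swap (base, op1);
return gen_rtx_fmt_ee (PLUS, addr_mode, base, op1);
```

This would also address the const_int concern raised above, since constants have the lowest commutative-operand precedence and therefore never get swapped into the first position.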