Hi,
GIMPLE IVO needs to call the backend interface to calculate costs for address
expressions like the ones below:
   FORM1: "r73 + r74 + 16380"
   FORM2: "r73 << 2 + r74 + 16380"

They are invalid address expressions on AArch64, so they are legitimized by
aarch64_legitimize_address.  Below is what we get from that function:

For FORM1, the address expression is legitimized into the insn sequence
and rtx below:
   r84:DI=r73:DI+r74:DI
   r85:DI=r84:DI+0x3000
   r83:DI=r85:DI
   "r83 + 4092"

For FORM2, the address expression is legitimized into the insn sequence
and rtx below:
   r108:DI=r73:DI<<0x2
   r109:DI=r108:DI+r74:DI
   r110:DI=r109:DI+0x3000
   r107:DI=r110:DI
   "r107 + 4092"

So the costs computed are 12/16 respectively.  The high cost prevents IVO
from choosing the right candidates.  Besides cost computation, I also think
the legitimization is bad in terms of code generation.

The root cause in aarch64_legitimize_address can be described by its
comment:
   /* Try to split X+CONST into Y=X+(CONST & ~mask), Y+(CONST&mask),
      where mask is selected by alignment and size of the offset.
      We try to pick as large a range for the offset as possible to
      maximize the chance of a CSE.  However, for aligned addresses
      we limit the range to 4k so that structures with different sized
      elements are likely to use the same base.  */
I think the split of CONST is intended for REG+CONST where the const offset
is not in the range of AArch64's addressing modes.  Unfortunately, it
doesn't explicitly handle/reject "REG+REG+CONST" and "REG+REG<<SCALE+CONST"
when the CONST is in the range of the addressing modes.  As a result, these
two cases fall through this logic, giving sub-optimal results.

It's obvious we can do the legitimization below instead:
FORM1:
   r83:DI=r73:DI+r74:DI
   "r83 + 16380"
FORM2:
   r107:DI=0x3ffc
   r106:DI=r74:DI+r107:DI
      REG_EQUAL r74:DI+0x3ffc
   "r106 + r73 << 2"

This patch handles these two cases as described.

Bootstrapped and tested on AArch64 along with the other patch.  Is it OK?

2015-11-04  Bin Cheng  <bin.ch...@arm.com>
            Jiong Wang  <jiong.w...@arm.com>

        * config/aarch64/aarch64.c (aarch64_legitimize_address): Handle
        address expressions like REG+REG+CONST and REG+NON_REG+CONST.

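
To make the arithmetic behind the 0x3000/4092 split concrete, here is a minimal standalone sketch in plain C.  It is not GCC code: split_offset and struct offset_split are made-up names for illustration, and the 0xfff mask is an assumption inferred from the split seen in the sequences above (the 4k limit mentioned in the comment).

```c
#include <assert.h>

/* Hypothetical illustration only; neither name exists in GCC.  */
struct offset_split
{
  long long base;  /* CONST & ~mask, added to the base register first.  */
  long long rest;  /* CONST & mask, left in the final address.  */
};

/* Split OFFSET as the quoted comment describes:
   Y = X + (CONST & ~mask), then address Y + (CONST & mask).
   MASK is assumed to be of the form 2^n - 1.  */
static struct offset_split
split_offset (long long offset, long long mask)
{
  struct offset_split s;

  /* Reject masks that are not a contiguous run of low bits.  */
  assert (mask >= 0 && (mask & (mask + 1)) == 0);

  s.base = offset & ~mask;
  s.rest = offset & mask;
  return s;
}
```

With the offset from FORM1 and FORM2, split_offset (16380, 0xfff) gives base 0x3000 and rest 4092, matching the "+0x3000" insn and the "+ 4092" address above.  The point of the patch is that 16380 can stay in the address as a single offset for this access, so the extra add is unnecessary.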
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5c8604f..47875ac 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4710,6 +4710,51 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
     {
       HOST_WIDE_INT offset = INTVAL (XEXP (x, 1));
       HOST_WIDE_INT base_offset;
+      rtx op0 = XEXP (x,0);
+
+      if (GET_CODE (op0) == PLUS)
+        {
+          rtx op0_ = XEXP (op0, 0);
+          rtx op1_ = XEXP (op0, 1);
+
+          /* RTX pattern in the form of (PLUS (PLUS REG, REG), CONST) will
+             reach here, the 'CONST' may be valid in which case we should
+             not split.  */
+          if (REG_P (op0_) && REG_P (op1_))
+            {
+              machine_mode addr_mode = GET_MODE (op0);
+              rtx addr = gen_reg_rtx (addr_mode);
+
+              rtx ret = plus_constant (addr_mode, addr, offset);
+              if (aarch64_legitimate_address_hook_p (mode, ret, false))
+                {
+                  emit_insn (gen_adddi3 (addr, op0_, op1_));
+                  return ret;
+                }
+            }
+          /* RTX pattern in the form of (PLUS (PLUS REG, NON_REG), CONST)
+             will reach here.  If (PLUS REG, NON_REG) is valid addr expr,
+             we split it into Y=REG+CONST, Y+NON_REG.  */
+          else if (REG_P (op0_) || REG_P (op1_))
+            {
+              machine_mode addr_mode = GET_MODE (op0);
+              rtx addr = gen_reg_rtx (addr_mode);
+
+              /* Switch to make sure that register is in op0_.  */
+              if (REG_P (op1_))
+                std::swap (op0_, op1_);
+
+              rtx ret = gen_rtx_fmt_ee (PLUS, addr_mode, addr, op1_);
+              if (aarch64_legitimate_address_hook_p (mode, ret, false))
+                {
+                  addr = force_operand (plus_constant (addr_mode,
+                                                       op0_, offset),
+                                        NULL_RTX);
+                  ret = gen_rtx_fmt_ee (PLUS, addr_mode, addr, op1_);
+                  return ret;
+                }
+            }
+        }
 
       /* Does it look like we'll need a load/store-pair operation?  */
       if (GET_MODE_SIZE (mode) > 16