Hi,
GIMPLE IVO needs to call the backend interface to calculate costs for
address expressions like the ones below:
   FORM1: "r73 + r74 + 16380"
   FORM2: "r73 << 2 + r74 + 16380"

Both are invalid address expressions on AArch64, so they will be legitimized
by aarch64_legitimize_address.  Below is what we get from that function:

For FORM1, the address expression is legitimized into the insn sequence
and rtx below:
   r84:DI=r73:DI+r74:DI
   r85:DI=r84:DI+0x3000
   r83:DI=r85:DI
   "r83 + 4092"

For FORM2, the address expression is legitimized into the insn sequence
and rtx below:
   r108:DI=r73:DI<<0x2
   r109:DI=r108:DI+r74:DI
   r110:DI=r109:DI+0x3000
   r107:DI=r110:DI
   "r107 + 4092"

So the costs computed are 12/16 respectively.  The high cost prevents IVO
from choosing the right candidates.  Besides cost computation, I also think
the legitimization is bad in terms of code generation.
The root cause in aarch64_legitimize_address can be described by its
comment:
   /* Try to split X+CONST into Y=X+(CONST & ~mask), Y+(CONST&mask),
      where mask is selected by alignment and size of the offset.
      We try to pick as large a range for the offset as possible to
      maximize the chance of a CSE.  However, for aligned addresses
      we limit the range to 4k so that structures with different sized
      elements are likely to use the same base.  */
I think the split of CONST is intended for REG+CONST where the const offset
is not in the range of AArch64's addressing modes.  Unfortunately, it
doesn't explicitly handle/reject "REG+REG+CONST" and "REG+REG<<SCALE+CONST"
when the CONST is in the range of the addressing modes.  As a result, these
two cases fall through this logic, resulting in sub-optimal results.
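
To make the arithmetic concrete, here is a minimal sketch of the split the
comment describes.  It is not the GCC code: split_offset and SplitOffset are
hypothetical names, and the mask is fixed at 0xfff (the 4k window), whereas
the real function selects the mask from the alignment and size of the access.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of "Y=X+(CONST & ~mask), Y+(CONST&mask)".
// Assumption: a fixed 4k window (mask = 0xfff); GCC picks the mask
// based on the alignment and size of the access.
struct SplitOffset {
  std::int64_t base_part;   // CONST & ~mask, added to the base register
  std::int64_t small_part;  // CONST & mask, left in the address
};

constexpr SplitOffset split_offset(std::int64_t offset,
                                   std::int64_t mask = 0xfff) {
  return SplitOffset{offset & ~mask, offset & mask};
}

// 16380 == 0x3ffc splits into 0x3000 + 4092, which matches the
// "+0x3000" insn and the "+ 4092" address in both sequences above.
static_assert(split_offset(16380).base_part == 0x3000, "high part");
static_assert(split_offset(16380).small_part == 4092, "low part");
```

The two parts always sum back to the original constant, so the split is
correct; the problem is only that it is applied when no split is needed.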

Obviously we can do the legitimization below instead:
FORM1:
   r83:DI=r73:DI+r74:DI
   "r83 + 16380"
FORM2:
   r107:DI=0x3ffc
   r106:DI=r74:DI+r107:DI
      REG_EQUAL r74:DI+0x3ffc
   "r106 + r73 << 2"
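
Why "r83 + 16380" can stay unsplit may be easier to see with a small range
check.  The sketch below is only an illustration: fits_scaled_uimm12 is a
hypothetical helper, not aarch64_legitimate_address_hook_p, and it models
just the unsigned, scaled 12-bit immediate form of AArch64 loads and stores.
It also assumes a 4-byte access, which the "<< 2" scale suggests but the
text above does not state.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical predicate: true when OFFSET fits the unsigned, scaled
// 12-bit immediate of an AArch64 load/store of ACCESS_SIZE bytes.
// The offset must be non-negative, a multiple of the access size, and
// at most 4095 units of that size.
constexpr bool fits_scaled_uimm12(std::int64_t offset,
                                  std::int64_t access_size) {
  return offset >= 0
         && offset % access_size == 0
         && offset / access_size <= 4095;
}

// Assumption: 4-byte access.  16380 == 4095 * 4 is the largest offset
// that still fits, so REG + 16380 needs no split.
static_assert(fits_scaled_uimm12(16380, 4), "in range for 4-byte access");
static_assert(!fits_scaled_uimm12(16384, 4), "one step past the range");
```

Under that assumption the constant in both forms sits exactly at the top of
the legal range, which is why the generic 4k split only adds an extra insn.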

This patch handles these two cases as described.
Bootstrapped and tested on AArch64 along with the other patch.  Is it OK?

2015-11-04  Bin Cheng  <bin.ch...@arm.com>
            Jiong Wang  <jiong.w...@arm.com>

        * config/aarch64/aarch64.c (aarch64_legitimize_address): Handle
        address expressions like REG+REG+CONST and REG+NON_REG+CONST.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5c8604f..47875ac 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4710,6 +4710,51 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
     {
       HOST_WIDE_INT offset = INTVAL (XEXP (x, 1));
       HOST_WIDE_INT base_offset;
+      rtx op0 = XEXP (x, 0);
+
+      if (GET_CODE (op0) == PLUS)
+       {
+         rtx op0_ = XEXP (op0, 0);
+         rtx op1_ = XEXP (op0, 1);
+
+         /* RTX pattern in the form of (PLUS (PLUS REG, REG), CONST) will
+            reach here, the 'CONST' may be valid in which case we should
+            not split.  */
+         if (REG_P (op0_) && REG_P (op1_))
+           {
+             machine_mode addr_mode = GET_MODE (op0);
+             rtx addr = gen_reg_rtx (addr_mode);
+
+             rtx ret = plus_constant (addr_mode, addr, offset);
+             if (aarch64_legitimate_address_hook_p (mode, ret, false))
+               {
+                 emit_insn (gen_adddi3 (addr, op0_, op1_));
+                 return ret;
+               }
+           }
+         /* RTX pattern in the form of (PLUS (PLUS REG, NON_REG), CONST)
+            will reach here.  If (PLUS REG, NON_REG) is valid addr expr,
+            we split it into Y=REG+CONST, Y+NON_REG.  */
+         else if (REG_P (op0_) || REG_P (op1_))
+           {
+             machine_mode addr_mode = GET_MODE (op0);
+             rtx addr = gen_reg_rtx (addr_mode);
+
+             /* Switch to make sure that register is in op0_.  */
+             if (REG_P (op1_))
+               std::swap (op0_, op1_);
+
+             rtx ret = gen_rtx_fmt_ee (PLUS, addr_mode, addr, op1_);
+             if (aarch64_legitimate_address_hook_p (mode, ret, false))
+               {
+                 addr = force_operand (plus_constant (addr_mode,
+                                                      op0_, offset),
+                                       NULL_RTX);
+                 ret = gen_rtx_fmt_ee (PLUS, addr_mode, addr, op1_);
+                 return ret;
+               }
+           }
+       }
 
       /* Does it look like we'll need a load/store-pair operation?  */
       if (GET_MODE_SIZE (mode) > 16
