At Segher's suggestion, I looked into changing the predicates on
bswapdi2_{load,store} from memory_operand to indexed_or_indirect_operand
and adding code to bswapdi2 to force the address into a register if it
is not already an indirect or indexed address.

The motivating case for this was the code I was seeing for the gpr
expansion of strncmp. Before the change I would typically see something
like this:

        addi 9,3,8
        ldbrx 10,0,9
        addi 9,4,8
        ldbrx 8,0,9
        subf. 9,8,10
        bne 0,.L13
        cmpb 10,10,9
        cmpdi 0,10,0
        bne 0,.L9
        addi 9,3,16
        ldbrx 10,0,9
        addi 9,4,16
        ldbrx 8,0,9
        subf. 9,8,10
        bne 0,.L13
        cmpb 10,10,9
        cmpdi 0,10,0
        bne 0,.L9

For each comparison block it is doing the add separately and using 0 for
one input of the ldbrx, so the indexed form of the instruction goes
unused.
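
For readers who want something to experiment with, here is a minimal
sketch of the kind of source that leads to a byte-reversed doubleword
load at a constant offset from a base pointer. It is not taken from the
strncmp expander. The names load_be64 and blocks_differ are hypothetical,
and it has not been checked that this exact code reproduces the sequence
above; it only assumes a compiler that provides __builtin_bswap64 and
__BYTE_ORDER__ (GCC does).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read the 8 bytes at p + off as a big-endian
   value.  On a little-endian target this is a load followed by a byte
   swap, which a 64-bit PowerPC compiler with ldbrx can combine into a
   single byte-reversed load.  */
uint64_t load_be64(const unsigned char *p, size_t off)
{
  uint64_t v;
  assert(p != NULL);
  memcpy(&v, p + off, sizeof v);
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  v = __builtin_bswap64(v);
#endif
  return v;
}

/* Hypothetical helper: compare the doublewords at offsets 8 and 16 of
   two buffers, mirroring the two comparison blocks shown above.  Each
   load uses a base register plus a constant offset.  */
int blocks_differ(const unsigned char *a, const unsigned char *b)
{
  return load_be64(a, 8) != load_be64(b, 8)
         || load_be64(a, 16) != load_be64(b, 16);
}
```

Each load here has a reg+constant address, which is the form that needs
either a separate addi or an index register for ldbrx.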

After this change, it is more like this:

        ldbrx 8,3,27
        ldbrx 7,4,27
        cmpb 9,8,9
        cmpb 10,8,7
        orc. 9,9,10
        bne 0,.L13
        ldbrx 8,3,24
        ldbrx 7,4,24
        cmpb 10,8,9
        cmpb 9,8,7
        orc. 9,10,9
        bne 0,.L13


Here it has created temps with constants and hoisted them out of a loop,
but I have other cases where it will update them if there is more
register pressure. In either case the code is more compact and makes
full use of the indexed addressing of ldbrx.
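
As a reading aid for the cmpb/orc. pairs in the new sequence, the sketch
below models them in portable C. It is an interpretation of the assembly
above, not code from the expander: cmpb_model and block_stop_mask are
hypothetical names, and it assumes the second cmpb operand in the first
compare (r9) holds zero on the fall-through path.

```c
#include <stdint.h>

/* Model of the Power cmpb instruction: each result byte is 0xff where
   the corresponding bytes of a and b are equal, and 0x00 otherwise.  */
uint64_t cmpb_model(uint64_t a, uint64_t b)
{
  uint64_t result = 0;
  for (int i = 0; i < 8; i++)
    {
      uint64_t mask = (uint64_t) 0xff << (8 * i);
      if ((a & mask) == (b & mask))
        result |= mask;
    }
  return result;
}

/* Model of one comparison block: a and b are the two byte-reversed
   doublewords.  The result is nonzero if a contains a zero byte or if
   a and b differ in any byte, which is when the bne would be taken.  */
uint64_t block_stop_mask(uint64_t a, uint64_t b)
{
  uint64_t zero_bytes = cmpb_model(a, 0);   /* cmpb against zero */
  uint64_t equal_bytes = cmpb_model(a, b);  /* cmpb of the two loads */
  return zero_bytes | ~equal_bytes;         /* orc: rs | ~rb */
}
```

A zero result means all eight bytes matched and none was a terminator,
so the comparison can fall through to the next block.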

Bootstrap/regtest passed on ppc64le targeting power7/power8/power9.
OK for trunk?

Thanks!
    Aaron

2018-10-27  Aaron Sawdey  <acsaw...@linux.ibm.com>

        * config/rs6000/rs6000.md (bswapdi2): Force address into register
        if not in one already.
        (bswapdi2_load): Change predicate to indexed_or_indirect_operand.
        (bswapdi2_store): Ditto.

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 265393)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -2512,9 +2512,27 @@
   if (TARGET_POWERPC64 && TARGET_LDBRX)
     {
       if (MEM_P (src))
-       emit_insn (gen_bswapdi2_load (dest, src));
+        {
+          rtx addr = XEXP (src, 0);
+          if (!legitimate_indirect_address_p (addr, reload_completed)
+              && !legitimate_indexed_address_p (addr, reload_completed))
+            {
+              addr = force_reg (Pmode, addr);
+              src = replace_equiv_address_nv (src, addr);
+            }
+         emit_insn (gen_bswapdi2_load (dest, src));
+        }
       else if (MEM_P (dest))
-       emit_insn (gen_bswapdi2_store (dest, src));
+        {
+          rtx addr = XEXP (dest, 0);
+          if (!legitimate_indirect_address_p (addr, reload_completed)
+              && !legitimate_indexed_address_p (addr, reload_completed))
+            {
+              addr = force_reg (Pmode, addr);
+              dest = replace_equiv_address_nv (dest, addr);
+            }
+         emit_insn (gen_bswapdi2_store (dest, src));
+        }
       else if (TARGET_P9_VECTOR)
        emit_insn (gen_bswapdi2_xxbrd (dest, src));
       else
@@ -2535,13 +2553,13 @@
 ;; Power7/cell has ldbrx/stdbrx, so use it directly
 (define_insn "bswapdi2_load"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
-       (bswap:DI (match_operand:DI 1 "memory_operand" "Z")))]
+       (bswap:DI (match_operand:DI 1 "indexed_or_indirect_operand" "Z")))]
   "TARGET_POWERPC64 && TARGET_LDBRX"
   "ldbrx %0,%y1"
   [(set_attr "type" "load")])

 (define_insn "bswapdi2_store"
-  [(set (match_operand:DI 0 "memory_operand" "=Z")
+  [(set (match_operand:DI 0 "indexed_or_indirect_operand" "=Z")
        (bswap:DI (match_operand:DI 1 "gpc_reg_operand" "r")))]
   "TARGET_POWERPC64 && TARGET_LDBRX"
   "stdbrx %1,%y0"




-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
