Hi,

This patch uses the REG_EQUAL note to get more precise information for
nonzero_bits, which helps remove unnecessary zero_extends.

Here is an example. When compiling Coremark, we have RTL like:

(insn 1244 386 388 47 (set (reg:SI 263 [ D.5767 ])
        (reg:SI 384 [ D.5767 ])) 786 {*thumb2_movsi_insn}
     (expr_list:REG_EQUAL (zero_extend:SI (mem:QI (reg/v/f:SI 271 [ memblock ]) [0 *memblock_13(D)+0 S1 A8]))
        (nil)))

From "reg:SI 384" alone, we only know that it is a 32-bit value. But from the
REG_EQUAL note, we know that it is an 8-bit value. Then, for the following RTL
sequence,

(insn 409 407 410 50 (set (reg:SI 308)
        (plus:SI (reg:SI 263 [ D.5767 ])
            (const_int -48 [0xffffffffffffffd0]))) core_state.c:170 4 {*arm_addsi3}
     (nil))
(insn 410 409 411 50 (set (reg:SI 309)
        (zero_extend:SI (subreg:QI (reg:SI 308) 0))) core_state.c:170 812 {thumb2_zero_extendqisi2_v6}
     (expr_list:REG_DEAD (reg:SI 308)
        (nil)))

the zero_extend for r309 can be optimized away by the combine pass.
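
To illustrate why the 8-bit knowledge matters, here is a minimal standalone
sketch. It assumes, hypothetically, that r309 feeds an unsigned "is it a
digit?" comparison against 9; the RTL above does not show the consumer, so
that use and all function names below are illustrative only. Under that
assumption, the comparison gives the same answer with or without the
zero_extend whenever r263 fits in 8 bits, but not for wider values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical consumer: compare the zero-extended low byte against 9,
   following insns 409 and 410 literally.  */
int digit_test_with_extend(uint32_t r263)
{
    uint32_t r308 = r263 - 48u;      /* insn 409: (plus:SI r263 -48) */
    uint32_t r309 = (uint8_t)r308;   /* insn 410: zero_extend of the low byte */
    return r309 <= 9u;
}

/* The same comparison done on the full 32-bit difference, no extend.  */
int digit_test_without_extend(uint32_t r263)
{
    uint32_t r308 = r263 - 48u;
    return r308 <= 9u;
}

/* Returns 1 if both forms agree for every input that fits in 8 bits.  */
int forms_agree_for_8bit_inputs(void)
{
    for (uint32_t v = 0; v <= 0xffu; v++)
        if (digit_test_with_extend(v) != digit_test_without_extend(v))
            return 0;
    return 1;
}

/* Simple self-check that can be called from a test driver.  */
void self_check(void)
{
    assert(forms_agree_for_8bit_inputs());
}
```

For a value wider than 8 bits, such as 0x130, the two forms disagree, which is
why knowing only "32-bit value" for r263 is not enough.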

Bootstrapped with no make check regressions on X86-64.
No make check regressions on Cortex-M4 qemu.
No Spec2K INT regressions on X86-64 and Cortex-A15 with -O3.
Coremark on Cortex-M7 is 0.3% better.
Coremark on Cortex-M4 regresses by 0.07% due to an alignment change.
No Coremark change on Cortex-M0 and Cortex-A15.

Unfortunately, I could not reduce this to a meaningful small test case, so no
test case is included in the patch.

Ok for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-11-21  Zhenqiang Chen  <zhenqiang.c...@arm.com>

        * combine.c (set_nonzero_bits_and_sign_copies): Try REG_EQUAL note.

diff --git a/gcc/combine.c b/gcc/combine.c
index 6a7d16b..68a883b 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1713,7 +1713,15 @@ set_nonzero_bits_and_sign_copies (rtx x, const_rtx set, void *data)
 
          /* Don't call nonzero_bits if it cannot change anything.  */
          if (rsp->nonzero_bits != ~(unsigned HOST_WIDE_INT) 0)
-           rsp->nonzero_bits |= nonzero_bits (src, nonzero_bits_mode);
+           {
+             rtx reg_equal = insn ? find_reg_note (insn, REG_EQUAL, NULL_RTX)
+                                    : NULL_RTX;
+             if (reg_equal)
+               rsp->nonzero_bits |= nonzero_bits (XEXP (reg_equal, 0),
+                                                  nonzero_bits_mode);
+             else
+               rsp->nonzero_bits |= nonzero_bits (src, nonzero_bits_mode);
+           }
          num = num_sign_bit_copies (SET_SRC (set), GET_MODE (x));
          if (rsp->sign_bit_copies == 0
              || rsp->sign_bit_copies > num)
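
For readers less familiar with combine, the changed branch can be modelled
outside GCC roughly as follows. This is only a toy sketch: struct toy_expr and
toy_accumulate are made-up names, not GCC types or functions, and each
"expression" simply carries a precomputed mask of the bits that may be nonzero
instead of calling nonzero_bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an RTL expression (not a GCC type): just the mask of
   bits that may be nonzero in its value.  */
struct toy_expr {
    uint32_t may_be_nonzero;
};

/* Models the patched branch: if a REG_EQUAL-style note is present, OR in
   the mask of the note's value; otherwise fall back to the set source.  */
uint32_t toy_accumulate(uint32_t known,
                        const struct toy_expr *src,
                        const struct toy_expr *reg_equal)
{
    const struct toy_expr *e = reg_equal != NULL ? reg_equal : src;
    return known | e->may_be_nonzero;
}
```

With a source mask of 0xffffffff (a plain SImode register) and a note mask of
0xff (a zero_extend from QImode), the accumulated mask stays at 0xff when the
note is used.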


