Hi, The patch tries to use REG_EQUAL to get more precise info for nonzero_bits, which helps to remove unnecessary zero_extend.
Here is an example when compiling Coremark, we have rtx like, (insn 1244 386 388 47 (set (reg:SI 263 [ D.5767 ]) (reg:SI 384 [ D.5767 ])) 786 {*thumb2_movsi_insn} (expr_list:REG_EQUAL (zero_extend:SI (mem:QI (reg/v/f:SI 271 [ memblock ]) [0 *memblock_13(D)+0 S1 A8])) (nil))) from "reg:SI 384", we can only know it is a 32-bit value. But from REG_EQUAL, we can know it is an 8-bit value. Then for the following rtx seq, (insn 409 407 410 50 (set (reg:SI 308) (plus:SI (reg:SI 263 [ D.5767 ]) (const_int -48 [0xffffffffffffffd0]))) core_state.c:170 4 {*arm_addsi3} (nil)) (insn 410 409 411 50 (set (reg:SI 309) (zero_extend:SI (subreg:QI (reg:SI 308) 0))) core_state.c:170 812 {thumb2_zero_extendqisi2_v6} (expr_list:REG_DEAD (reg:SI 308) (nil))) the zero_extend for r309 can be optimized by combine pass. Bootstrap and no make check regression on X86-64. No make check regression on Cortex-M4 qemu. No Spec2K INT regression on X86-64 and Cortex-A15 with -O3. Coremark on Cortex-M7 is 0.3% better. Coremark on Cortex-M4 is 0.07% regression due to alignment change. No Coremark change on Corter-M0 and Cortex-A15. Unfortunately I failed to generate a meaningful small case for it. So no test case is included in the patch. Ok for trunk? Thanks! -Zhenqiang ChangeLog: 2014-11-21 Zhenqiang Chen <zhenqiang.c...@arm.com> * combine.c (set_nonzero_bits_and_sign_copies): Try REG_EQUAL note. diff --git a/gcc/combine.c b/gcc/combine.c index 6a7d16b..68a883b 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -1713,7 +1713,15 @@ set_nonzero_bits_and_sign_copies (rtx x, const_rtx set, void *data) /* Don't call nonzero_bits if it cannot change anything. */ if (rsp->nonzero_bits != ~(unsigned HOST_WIDE_INT) 0) - rsp->nonzero_bits |= nonzero_bits (src, nonzero_bits_mode); + { + rtx reg_equal = insn ? find_reg_note (insn, REG_EQUAL, NULL_RTX) + : NULL_RTX; + if (reg_equal) + rsp->nonzero_bits |= nonzero_bits (XEXP (reg_equal, 0), + nonzero_bits_mode); + else + rsp->nonzero_bits |= nonzero_bits (src, nonzero_bits_mode); + } num = num_sign_bit_copies (SET_SRC (set), GET_MODE (x)); if (rsp->sign_bit_copies == 0 || rsp->sign_bit_copies > num)