Eric Botcazou <ebotca...@adacore.com> writes:
>> The comment says that we're trying to match:
>>
>>    1. (set (reg:SI) (zero_extend:SI (plus:QI (mem:QI) (const_int))))
>>    2. (set (reg:QI) (plus:QI (mem:QI) (const_int)))
>>    3. (set (reg:QI) (plus:QI (subreg:QI) (const_int)))
>>    4. (set (reg:CC) (compare:CC (subreg:QI) (const_int)))
>>    5. (set (reg:CC) (compare:CC (plus:QI (mem:QI) (const_int))))
>>    6. (set (reg:SI) (leu:SI (subreg:QI) (const_int)))
>>    7. (set (reg:SI) (leu:SI (subreg:QI) (const_int)))
>>    8. (set (reg:SI) (leu:SI (plus:QI ...)))
>>
>> And I think that's what we should be matching in cases where the
>> extension isn't redundant, even on RISC targets.
>
> Which one(s) exactly?  Most of the RISC targets we have are parameterized
> (WORD_REGISTER_OPERATIONS, PROMOTE_MODE, etc) to avoid operations in modes
> smaller than the word mode.
The first one, sorry.

>> The problem here isn't really about which mode is on the plus,
>> but whether we recognise that the extension instruction is redundant.
>> I.e. we start with:
>>
>>   (insn 9 8 10 2 (set (reg:SI 120)
>>           (plus:SI (subreg:SI (reg:QI 118) 0)
>>               (const_int -48 [0xffffffffffffffd0]))) test.c:6 -1
>>        (nil))
>>   (insn 10 9 11 2 (set (reg:SI 121)
>>           (and:SI (reg:SI 120)
>>               (const_int 255 [0xff]))) test.c:6 -1
>>        (nil))
>>   (insn 11 10 12 2 (set (reg:CC 100 cc)
>>           (compare:CC (reg:SI 121)
>>               (const_int 9 [0x9]))) test.c:6 -1
>>        (nil))
>>
>> and what we want combine to do is to recognise that insn 10 is redundant
>> and reduce the sequence to:
>>
>>   (insn 9 8 10 2 (set (reg:SI 120)
>>           (plus:SI (subreg:SI (reg:QI 118) 0)
>>               (const_int -48 [0xffffffffffffffd0]))) test.c:6 -1
>>        (nil))
>>   (insn 11 10 12 2 (set (reg:CC 100 cc)
>>           (compare:CC (reg:SI 120)
>>               (const_int 9 [0x9]))) test.c:6 -1
>>        (nil))
>>
>> But insn 10 is redundant on all targets, not just RISC ones.
>> It isn't about whether the target has a QImode addition or not.
>
> That's theoretical though since, on x86 for example, the redundant
> instruction isn't even generated because of the QImode addition...

Not for this testcase, sure, but we use an SImode addition and keep the
equivalent redundant extension until combine for:

  int
  foo (unsigned char *x)
  {
    return (((unsigned int) *x - 48) & 0xff) < 10;
  }

Immediately before combine:

  (insn 7 6 8 2 (parallel [
              (set (reg:SI 93 [ D.1753 ])
                  (plus:SI (reg:SI 92 [ D.1753 ])
                      (const_int -48 [0xffffffffffffffd0])))
              (clobber (reg:CC 17 flags))
          ]) /tmp/foo.c:3 261 {*addsi_1}
       (expr_list:REG_DEAD (reg:SI 92 [ D.1753 ])
          (expr_list:REG_UNUSED (reg:CC 17 flags)
              (nil))))
  (insn 8 7 9 2 (set (reg:SI 94 [ D.1753 ])
          (zero_extend:SI (subreg:QI (reg:SI 93 [ D.1753 ]) 0)))
       /tmp/foo.c:3 133 {*zero_extendqisi2}
       (expr_list:REG_DEAD (reg:SI 93 [ D.1753 ])
          (nil)))
  (insn 9 8 10 2 (set (reg:CC 17 flags)
          (compare:CC (reg:SI 94 [ D.1753 ])
              (const_int 9 [0x9]))) /tmp/foo.c:3 7 {*cmpsi_1}
       (expr_list:REG_DEAD (reg:SI 94 [ D.1753 ])
          (nil)))

What saves us isn't QImode addition but QImode comparison.  After combine:

  (insn 7 6 8 2 (parallel [
              (set (reg:SI 93 [ D.1753 ])
                  (plus:SI (reg:SI 92 [ D.1753 ])
                      (const_int -48 [0xffffffffffffffd0])))
              (clobber (reg:CC 17 flags))
          ]) /tmp/foo.c:3 261 {*addsi_1}
       (expr_list:REG_DEAD (reg:SI 92 [ D.1753 ])
          (expr_list:REG_UNUSED (reg:CC 17 flags)
              (nil))))
  (note 8 7 9 2 NOTE_INSN_DELETED)
  (insn 9 8 10 2 (set (reg:CC 17 flags)
          (compare:CC (subreg:QI (reg:SI 93 [ D.1753 ]) 0)
              (const_int 9 [0x9]))) /tmp/foo.c:3 5 {*cmpqi_1}
       (expr_list:REG_DEAD (reg:SI 93 [ D.1753 ])
          (nil)))

        movzbl  (%rdi), %eax
        subl    $48, %eax
        cmpb    $9, %al
        setbe   %al
        movzbl  %al, %eax
        ret

(The patch didn't affect things here.)

FWIW, change the testcase to:

  int
  foo (unsigned char *x)
  {
    return (((unsigned int) *x - 48) & 0x1ff) < 10;
  }

and we keep the redundant AND, again regardless of whether the patch
is applied.

>> Well, I think making the simplify-rtx code conditional on the target
>> would be the wrong way to go.  If we really can't live with it being
>> unconditional then I think we should revert it.  But like I say I think
>> it would be better to make combine recognise the redundancy even with
>> the new form.  (Or as I say, longer term, not to rely on combine to
>> eliminate redundant extensions.)  But I don't have time to do that
>> myself...
>
> It helps x86 so we won't revert it.  My fear is that we'll need to add
> code in other places to RISCify back the result of this
> "simplification".
But that's the problem with trying to do the optimisation in this way.
We first simplify a truncation of an SImode addition X.  Then we
simplify a zero extension of that truncation.  Then we have the
opportunity to realise that the zero extension wasn't necessary after
all, so we actually want to undo both simplifications and go back to
the original addition pattern.

So undoing the simplifications is precisely what we're aiming for here,
again regardless of RISCness.

Thanks,
Richard
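
P.S. For anyone who wants to convince themselves of the equivalence
being relied on in the 0xff testcase above: the explicit extension
(insn 8 in the x86 dump, or the AND with 255 in the first sequence)
contributes nothing to the final result, because only the low byte of
the subtraction matters once the comparison is done in QImode.  A
throwaway self-check, purely illustrative and not part of any patch
or testcase:

  #include <assert.h>

  int
  main (void)
  {
    for (unsigned int c = 0; c <= 0xff; c++)
      {
        /* What is computed before combine: an SImode subtraction,
           a zero extension of its low byte, then the comparison.  */
        unsigned int extended = ((c - 48) & 0xff) < 10;

        /* What is left after combine: the same SImode subtraction,
           with the comparison done directly on the low byte (cmpb).  */
        unsigned int narrowed = (unsigned char) (c - 48) < 10;

        assert (extended == narrowed);
      }
    return 0;
  }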