My question is: where and how would you suggest we do this optimization? With peephole2, or in combine? In i386.md, I see that the pattern *subsi_2 looks like what I'd like to combine these two insns into:

(define_insn "*subsi_2"
  [(set (reg FLAGS_REG)
        (compare
          (minus:SI (match_operand:SI 1 "nonimmediate_operand" "0,0")
                    (match_operand:SI 2 "general_operand" "ri,rm"))
          (const_int 0)))
   (set (match_operand:SI 0 "nonimmediate_operand" "=rm,r")
        (minus:SI (match_dup 1) (match_dup 2)))]
  "ix86_match_ccmode (insn, CCGOCmode)
   && ix86_binary_operator_ok (MINUS, SImode, operands)"
  "sub{l}\t{%2, %0|%0, %2}"
  [(set_attr "type" "alu")
   (set_attr "mode" "SI")])

But I do not see a peephole2 that would generate this insn. Does anyone know how this pattern is used?

This is the sort of thing you'd expect combine.c to make when it combines an scc insn with arithmetic when the arithmetic result is also used.
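
For concreteness, here is a rough sketch of the kind of source that might give combine a chance to form this pattern: the difference is stored and also compared against zero. The function name and signature are made up for illustration, and whether a given GCC version and option set actually matches *subsi_2 here (rather than emitting a separate test or cmp) is an assumption you would need to check, for example by looking at the -S output or the combine dump.

```c
/* Hypothetical test case, not taken from the GCC testsuite.
   The subtraction result is used twice: once as a value that is
   stored through DIFF, and once as the input to a comparison
   against zero.  A single "sub" that sets both the destination
   and the flags would cover both uses.  */
int
sub_and_test_zero (int a, int b, int *diff)
{
  int d = a - b;   /* arithmetic whose result is also used */
  *diff = d;       /* keeps the difference live */
  return d == 0;   /* scc-style use of the comparison result */
}
```

Compiling this with something like -O2 -S and checking whether the sub is followed by a redundant test should show whether the combination happened.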