https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94037
Jakub Jelinek <jakub at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |jakub at gcc dot gnu.org, | |uros at gcc dot gnu.org --- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> --- The setge %sil movzbl %sil, %esi to xorl %esi, %esi setge %sil transformation is something GCC does too with the ;; Convert setcc + movzbl to xor + setcc if operands don't overlap. peephole2s. The reason it doesn't trigger here is that the single comparison has 3 uses (why not 2 and something didn't CSE those?). So, at least for the two setcc quite long after cmp case, we might need to either not have the comparison in the peephole2 and instead walk backwards, looking for the FLAGS_REG setter (or stopping on something that clobbers it) and stopping on if the register we'd want to xor first is mentioned by any insn in between. We have at the start of peephole2: (insn 37 36 38 5 (set (reg:CCGC 17 flags) (compare:CCGC (reg/v:SI 1 dx [orig:86 pivot ] [86]) (reg:SI 0 ax [orig:242 pretmp_404 ] [242]))) "pr94037.C":6:19 11 {*cmpsi_1} (expr_list:REG_DEAD (reg:SI 0 ax [orig:242 pretmp_404 ] [242]) (nil))) (note 38 37 39 5 NOTE_INSN_DELETED) (note 39 38 1178 5 NOTE_INSN_DELETED) (insn 1178 39 1179 5 (set (reg:QI 2 cx [288]) (lt:QI (reg:CCGC 17 flags) (const_int 0 [0]))) "pr94037.C":6:21 732 {*setcc_qi} (nil)) (insn 1179 1178 41 5 (set (reg:DI 2 cx [288]) (zero_extend:DI (reg:QI 2 cx [288]))) "pr94037.C":6:21 115 {zero_extendqidi2} (nil)) (insn 41 1179 42 5 (set (reg:SI 2 cx [289]) (mem:SI (plus:DI (plus:DI (mult:DI (reg:DI 2 cx [288]) (const_int 4 [0x4])) (reg/f:DI 7 sp)) (const_int 120 [0x78])) [1 MEM[(int[2] *)_213] S4 A32])) "pr94037.C":6:15 67 {*movsi_internal} (expr_list:REG_EQUIV (mem:SI (reg/v/f:DI 36 r8 [orig:283 begin ] [283]) [1 *begin_329+0 S4 A32]) (nil))) (insn 42 41 45 5 (set (mem:SI (reg/v/f:DI 36 r8 [orig:283 begin ] [283]) [1 *begin_329+0 S4 A32]) (reg:SI 2 cx [289])) "pr94037.C":6:15 67 {*movsi_internal} (expr_list:REG_DEAD (reg:SI 2 cx [289]) (nil))) (note 45 42 1180 5 NOTE_INSN_DELETED) (insn 1180 45 1181 5 (set (reg:QI 0 ax [290]) (ge:QI (reg:CCGC 17 flags) (const_int 0 [0]))) "pr94037.C":15:10 732 {*setcc_qi} (expr_list:REG_DEAD (reg:CCGC 17 flags) (nil))) (insn 1181 1180 47 5 (set (reg:DI 0 ax [290]) (zero_extend:DI (reg:QI 0 ax [290]))) "pr94037.C":15:10 115 {zero_extendqidi2} (nil)) and current peephole2 manages to handle the first setcc+movzbl. The second one actually isn't doable because of the RA decisions, as the comparison uses ax and we'd need to clear rax before the comparison. Similarly, in partition we have 3 setcc+movzbls, but the first one has the movzbl not adjacent to the setcc, second one would be doable if the peephole2 pattern didn't include the FLAGS_REG setter and walked back and the third one isn't doable because the comparison uses cx, i.e. the register that's set by setcc and extended by movzbl. Now, on the GIMPLE level, we have e.g. in partition: _2 = MEM[base: right_35, offset: 0B]; _3 = _2 <= pivot_12; _4 = (int) _3; _19 = *left_31; _7 = {_19, _2}; MEM <vector(2) int> [(int *)&v] = _7; _21 = v[_4]; *left_31 = _21; _22 = _2 > pivot_12; _23 = (int) _22; _24 = v[_23]; MEM[base: right_35, offset: 0] = _24; v ={v} {CLOBBER}; _5 = _2 <= pivot_12 ? 4 : 0; so I wonder why sccvn has not at least replaced the last _2 <= pivot_12 with _3. And, if SCCVN could be taught that if we have _3 = _2 <= pivot_12 and later _22 = _2 > pivot_12, we can simplify the latter to _22 = 1 - _3; 1 -