https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600
Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #11 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jonathan Wakely from comment #0)
> These two implementations of C++26 saturating addition
> (std::add_sat<unsigned>) have equivalent behaviour:
>
> unsigned
> add_sat(unsigned x, unsigned y) noexcept
> {
>     unsigned z;
>     if (!__builtin_add_overflow(x, y, &z))
>         return z;
>     return -1u;
> }

[...]

> For -O3 on x86_64 GCC uses a branch for the first one:
>
> add_sat(unsigned int, unsigned int):
>         add     edi, esi
>         jc      .L3
>         mov     eax, edi
>         ret
> .L3:
>         or      eax, -1
>         ret

If-conversion to cmove fails because of the "weird" compare arguments, a
consequence of the addsi3_cc_overflow_1 definition:

(insn 9 4 10 2 (parallel [
            (set (reg:CCC 17 flags)
                (compare:CCC (plus:SI (reg:SI 106)
                        (reg:SI 107))
                    (reg:SI 106)))
            (set (reg:SI 104)
                (plus:SI (reg:SI 106)
                    (reg:SI 107)))
        ]) "sadd.c":7:12 477 {addsi3_cc_overflow_1}
     (expr_list:REG_DEAD (reg:SI 107)
        (expr_list:REG_DEAD (reg:SI 106)
            (nil))))

The noce_try_cmove path fails in noce_emit_cmove:

Breakpoint 1, noce_emit_cmove (if_info=0x7fffffffd750, x=0x7fffe9fe4e40,
    code=LTU, cmp_a=0x7fffe9fe4a20, cmp_b=0x7fffe9feb9a8,
    vfalse=0x7fffe9fe49d8, vtrue=0x7fffe9e09480, cc_cmp=0x0, rev_cc_cmp=0x0)
    at ../../git/gcc/gcc/ifcvt.cc:1774
1774            return NULL_RTX;
(gdb) list
1766      /* Don't even try if the comparison operands are weird
1767         except that the target supports cbranchcc4.  */
1768      if (! general_operand (cmp_a, GET_MODE (cmp_a))
1769          || ! general_operand (cmp_b, GET_MODE (cmp_b)))
1770        {
1771          if (!have_cbranchcc4
1772              || GET_MODE_CLASS (GET_MODE (cmp_a)) != MODE_CC
1773              || cmp_b != const0_rtx)
1774            return NULL_RTX;
1775        }
1776
1777      target = emit_conditional_move (x, { code, cmp_a, cmp_b, VOIDmode },
1778                                      vtrue, vfalse, GET_MODE (x),
(gdb) bt
#0  noce_emit_cmove (if_info=0x7fffffffd750, x=0x7fffe9fe4e40, code=LTU,
    cmp_a=0x7fffe9fe4a20, cmp_b=0x7fffe9feb9a8, vfalse=0x7fffe9fe49d8,
    vtrue=0x7fffe9e09480, cc_cmp=0x0, rev_cc_cmp=0x0)
    at ../../git/gcc/gcc/ifcvt.cc:1774
#1  0x00000000020d995b in noce_try_cmove (if_info=0x7fffffffd750)
    at ../../git/gcc/gcc/ifcvt.cc:1884
#2  0x00000000020dec37 in noce_process_if_block (if_info=0x7fffffffd750)
    at ../../git/gcc/gcc/ifcvt.cc:4149
#3  0x00000000020e0248 in noce_find_if_block (test_bb=0x7fffe9fb5d80,
    then_edge=0x7fffe9fd7cc0, else_edge=0x7fffe9fd7c60, pass=1)
    at ../../git/gcc/gcc/ifcvt.cc:4716
#4  0x00000000020e08e9 in find_if_header (test_bb=0x7fffe9fb5d80, pass=1)
    at ../../git/gcc/gcc/ifcvt.cc:4921
#5  0x00000000020e3255 in if_convert (after_combine=true)
    at ../../git/gcc/gcc/ifcvt.cc:6068
(gdb) p debug_rtx (cmp_a)
(plus:SI (reg:SI 106)
    (reg:SI 107))
$1 = void
(gdb) p debug_rtx (cmp_b)
(reg:SI 106)
$2 = void

The above cmp_a RTX fails the general_operand check.

Please note that a similar testcase:

unsigned
sub_sat(unsigned x, unsigned y)
{
    unsigned z;
    return __builtin_sub_overflow(x, y, &z) ? 0 : z;
}

results in the expected:

        subl    %esi, %edi      # 52    [c=4 l=2]  *subsi_3/0
        movl    $0, %eax        # 53    [c=4 l=5]  *movsi_internal/0
        cmovnb  %edi, %eax      # 54    [c=4 l=3]  *movsicc_noc/0
        ret                     # 50    [c=0 l=1]  simple_return_internal

due to:

(insn 9 4 10 2 (parallel [
            (set (reg:CC 17 flags)
                (compare:CC (reg:SI 106)
                    (reg:SI 107)))
            (set (reg:SI 104)
                (minus:SI (reg:SI 106)
                    (reg:SI 107)))
        ]) "sadd.c":28:12 416 {*subsi_3}
     (expr_list:REG_DEAD (reg:SI 107)
        (expr_list:REG_DEAD (reg:SI 106)
            (nil))))

So either the addsi3_cc_overflow_1 RTX is not correct, or noce_emit_cmove
should be improved to handle the above "weird" operand form.

Let's ask Jakub.
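
For readers who want to reproduce the two testcases side by side, here is a
minimal C sketch. It mirrors the shape of the two compares at source level:
the addition carry test has the form (x + y) < x, matching the
LTU compare of (plus x y) against x seen in the addsi3_cc_overflow_1 insn,
while the subtraction borrow test is a plain x < y compare of two registers.
The names add_sat_ref, sub_sat_ref and check_saturating_ops are hypothetical
helpers, not part of the report, and nothing here is a claim about what code
GCC emits for the reference forms.

```c
#include <assert.h>
#include <limits.h>

/* add_sat testcase from the report (C++-only noexcept dropped). */
unsigned add_sat(unsigned x, unsigned y)
{
    unsigned z;
    if (!__builtin_add_overflow(x, y, &z))
        return z;
    return -1u;
}

/* sub_sat testcase from the report. */
unsigned sub_sat(unsigned x, unsigned y)
{
    unsigned z;
    return __builtin_sub_overflow(x, y, &z) ? 0 : z;
}

/* Hypothetical reference: carry out of x + y happens exactly when the
   wrapped sum is below x, i.e. LTU ((x + y), x).  */
unsigned add_sat_ref(unsigned x, unsigned y)
{
    unsigned z = x + y;
    return z < x ? UINT_MAX : z;
}

/* Hypothetical reference: borrow out of x - y happens exactly when
   x < y, a compare of two plain operands.  */
unsigned sub_sat_ref(unsigned x, unsigned y)
{
    return x < y ? 0 : x - y;
}

/* Compare the builtin-based versions against the references on a few
   edge values; aborts via assert on any mismatch.  */
void check_saturating_ops(void)
{
    static const unsigned edge[] = {
        0u, 1u, 2u, UINT_MAX / 2, UINT_MAX - 1, UINT_MAX
    };
    const unsigned n = sizeof edge / sizeof edge[0];

    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++) {
            assert(add_sat(edge[i], edge[j]) == add_sat_ref(edge[i], edge[j]));
            assert(sub_sat(edge[i], edge[j]) == sub_sat_ref(edge[i], edge[j]));
        }
}
```

Calling check_saturating_ops() from a small main and compiling with -O3 -S
should make it possible to compare the generated code for the four functions;
which of them end up as a branch or a cmove will depend on the GCC version.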