On Sat, Feb 8, 2025 at 9:40 AM Jakub Jelinek <ja...@redhat.com> wrote: > > Hi! > > The following testcase is miscompiled because of RTL represententation > of bt{l,q} insn followed by e.g. j{c,nc} being misleading to what it > actually does. > Let's look e.g. at > (define_insn_and_split "*jcc_bt<mode>" > [(set (pc) > (if_then_else (match_operator 0 "bt_comparison_operator" > [(zero_extract:SWI48 > (match_operand:SWI48 1 "nonimmediate_operand") > (const_int 1) > (match_operand:QI 2 "nonmemory_operand")) > (const_int 0)]) > (label_ref (match_operand 3)) > (pc))) > (clobber (reg:CC FLAGS_REG))] > "(TARGET_USE_BT || optimize_function_for_size_p (cfun)) > && (CONST_INT_P (operands[2]) > ? (INTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode) > && INTVAL (operands[2]) > >= (optimize_function_for_size_p (cfun) ? 8 : 32)) > : !memory_operand (operands[1], <MODE>mode)) > && ix86_pre_reload_split ()" > "#" > "&& 1" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > (zero_extract:SWI48 > (match_dup 1) > (const_int 1) > (match_dup 2)) > (const_int 0))) > (set (pc) > (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)]) > (label_ref (match_dup 3)) > (pc)))] > { > operands[0] = shallow_copy_rtx (operands[0]); > PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0]))); > }) > The define_insn part in RTL describes exactly what it does, > jumps to op3 if bit op2 in op1 is set (for op0 NE) or not set (for op0 EQ). > The problem is with what it splits into. > put_condition_code %C1 for CCCmode comparisons emits c for EQ and LTU, > nc for NE and GEU and ICEs otherwise. > CCCmode is used mainly for carry out of add/adc, borrow out of sub/sbb, > in those cases e.g. for add we have > (set (reg:CCC flags) (compare:CCC (plus:M x y) x)) > and use (ltu (reg:CCC flags) (const_int 0)) for carry set and > (geu (reg:CCC flags) (const_int 0)) for carry not set. These cases > model in RTL what is actually happening, compare in infinite precision > x from the result of finite precision addition in M mode and if it is > less than unsigned (i.e. overflow happened), carry is set. > Another use of CCCmode is in UNSPEC_* patterns, those are used with > (eq (reg:CCC flags) (const_int 0)) for carry set and ne for unset, > given the UNSPEC no big deal, the middle-end doesn't know what means > set or unset. > But for the bt{l,q}; j{c,nc} case the above splits it into > (set (reg:CCC flags) (compare:CCC (zero_extract) (const_int 0))) > for bt and > (set (pc) (if_then_else (eq (reg:CCC flags) (const_int 0)) (label_ref) (pc))) > for the bit set case (so that the jump expands to jc) and ne for > the bit not set case (so that the jump expands to jnc). > Similarly for the different splitters for cmov and set{c,nc} etc. > The problem is that when the middle-end reads this RTL, it feels > the exact opposite to it. If zero_extract is 1, flags is set > to comparison of 1 and 0 and that would mean using ne ne in the > if_then_else, and vice versa. > > So, in order to better describe in RTL what is actually happening, > one possibility would be to swap the behavior of put_condition_code > and use NE + LTU -> c and EQ + GEU -> nc rather than the current > EQ + LTU -> c and NE + GEU -> nc; and adjust everything. The > following patch uses a more limited approach, instead of representing > bt{l,q}; j{c,nc} case as written above it uses > (set (reg:CCC flags) (compare:CCC (const_int 0) (zero_extract))) > and > (set (pc) (if_then_else (ltu (reg:CCC flags) (const_int 0)) (label_ref) (pc))) > which uses the existing put_condition_code but describes what the > insns actually do in RTL clearly. If zero_extract is 1, > then flags are LTU, 0U < 1U, if zero_extract is 0, then flags are GEU, > 0U >= 0U. The patch adjusts the *bt<mode> define_insn and all the > splitters to it and its comparisons/conditional moves/setXX. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2025-02-08 Jakub Jelinek <ja...@redhat.com> > > PR target/118623 > * config/i386/i386.md (*bt<mode>): Represent bt as > compare:CCC of const0_rtx and zero_extract rather than > zero_extract and const0_rtx. > (*bt<SWI48:mode>_mask): Likewise. > (*jcc_bt<mode>): Likewise. Use LTU and GEU as flags test > instead of EQ and NE. > (*jcc_bt<mode>_mask): Likewise. > (*jcc_bt<SWI48:mode>_mask_1): Likewise. > (Help combine recognize bt followed by cmov splitter): Likewise. > (*bt<mode>_setcqi): Likewise. > (*bt<mode>_setncqi): Likewise. > (*bt<mode>_setnc<mode>): Likewise. > (*bt<mode>_setncqi_2): Likewise. > (*bt<mode>_setc<mode>_mask): Likewise. > > * gcc.c-torture/execute/pr118623.c: New test.
OK, but please note that generated comparison is not in its canonical form: "For commutative and comparison operators, a constant is always made the second operand." AFAICS, this should not be a problem, but perhaps some optimizations can be suppressed by this. Uros. > > --- gcc/config/i386/i386.md.jj 2025-02-07 22:11:49.444941171 +0100 > +++ gcc/config/i386/i386.md 2025-02-07 22:32:19.640943663 +0100 > @@ -19124,11 +19124,11 @@ (define_peephole2 > (define_insn "*bt<mode>" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > + (const_int 0) > (zero_extract:SWI48 > (match_operand:SWI48 0 "nonimmediate_operand" "r,m") > (const_int 1) > - (match_operand:QI 1 "nonmemory_operand" "q<S>,<S>")) > - (const_int 0)))] > + (match_operand:QI 1 "nonmemory_operand" "q<S>,<S>"))))] > "" > { > switch (get_attr_mode (insn)) > @@ -19155,14 +19155,14 @@ (define_insn "*bt<mode>" > (define_insn_and_split "*bt<SWI48:mode>_mask" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > + (const_int 0) > (zero_extract:SWI48 > (match_operand:SWI48 0 "nonimmediate_operand" "r,m") > (const_int 1) > (subreg:QI > (and:SWI248 > (match_operand:SWI248 1 "register_operand") > - (match_operand 2 "const_int_operand")) 0)) > - (const_int 0)))] > + (match_operand 2 "const_int_operand")) 0))))] > "TARGET_USE_BT > && (INTVAL (operands[2]) & (GET_MODE_BITSIZE (<SWI48:MODE>mode)-1)) > == GET_MODE_BITSIZE (<SWI48:MODE>mode)-1 > @@ -19171,8 +19171,8 @@ (define_insn_and_split "*bt<SWI48:mode>_ > "&& 1" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > - (zero_extract:SWI48 (match_dup 0) (const_int 1) (match_dup 1)) > - (const_int 0)))] > + (const_int 0) > + (zero_extract:SWI48 (match_dup 0) (const_int 1) (match_dup 1))))] > "operands[1] = gen_lowpart (QImode, operands[1]);") > > (define_insn_and_split "*jcc_bt<mode>" > @@ -19197,18 +19197,18 @@ (define_insn_and_split "*jcc_bt<mode>" > "&& 1" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > + (const_int 0) > (zero_extract:SWI48 > (match_dup 1) > (const_int 1) > - (match_dup 2)) > - (const_int 0))) > + (match_dup 2)))) > (set (pc) > (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)]) > (label_ref (match_dup 3)) > (pc)))] > { > operands[0] = shallow_copy_rtx (operands[0]); > - PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0]))); > + PUT_CODE (operands[0], GET_CODE (operands[0]) == NE ? LTU : GEU); > }) > > ;; Avoid useless masking of bit offset operand. > @@ -19233,18 +19233,18 @@ (define_insn_and_split "*jcc_bt<mode>_ma > "&& 1" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > + (const_int 0) > (zero_extract:SWI48 > (match_dup 1) > (const_int 1) > - (match_dup 2)) > - (const_int 0))) > + (match_dup 2)))) > (set (pc) > (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)]) > (label_ref (match_dup 4)) > (pc)))] > { > operands[0] = shallow_copy_rtx (operands[0]); > - PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0]))); > + PUT_CODE (operands[0], GET_CODE (operands[0]) == NE ? LTU : GEU); > }) > > ;; Avoid useless masking of bit offset operand. > @@ -19270,18 +19270,18 @@ (define_insn_and_split "*jcc_bt<SWI48:mo > "&& 1" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > + (const_int 0) > (zero_extract:SWI48 > (match_dup 1) > (const_int 1) > - (match_dup 2)) > - (const_int 0))) > + (match_dup 2)))) > (set (pc) > (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)]) > (label_ref (match_dup 4)) > (pc)))] > { > operands[0] = shallow_copy_rtx (operands[0]); > - PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0]))); > + PUT_CODE (operands[0], GET_CODE (operands[0]) == NE ? LTU : GEU); > operands[2] = gen_lowpart (QImode, operands[2]); > }) > > @@ -19302,10 +19302,10 @@ (define_split > && ix86_pre_reload_split ()" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > - (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)) > - (const_int 0))) > + (const_int 0) > + (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)))) > (set (match_dup 0) > - (if_then_else:SWI248 (eq (reg:CCC FLAGS_REG) (const_int 0)) > + (if_then_else:SWI248 (ltu (reg:CCC FLAGS_REG) (const_int 0)) > (match_dup 3) > (match_dup 4)))] > { > @@ -19326,10 +19326,10 @@ (define_insn_and_split "*bt<mode>_setcqi > "&& 1" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > - (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)) > - (const_int 0))) > + (const_int 0) > + (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)))) > (set (match_dup 0) > - (eq:QI (reg:CCC FLAGS_REG) (const_int 0)))]) > + (ltu:QI (reg:CCC FLAGS_REG) (const_int 0)))]) > > ;; Help combine recognize bt followed by setnc > (define_insn_and_split "*bt<mode>_setncqi" > @@ -19346,10 +19346,10 @@ (define_insn_and_split "*bt<mode>_setncq > "&& 1" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > - (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)) > - (const_int 0))) > + (const_int 0) > + (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)))) > (set (match_dup 0) > - (ne:QI (reg:CCC FLAGS_REG) (const_int 0)))]) > + (geu:QI (reg:CCC FLAGS_REG) (const_int 0)))]) > > (define_insn_and_split "*bt<mode>_setnc<mode>" > [(set (match_operand:SWI48 0 "register_operand") > @@ -19364,10 +19364,10 @@ (define_insn_and_split "*bt<mode>_setnc< > "&& 1" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > - (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)) > - (const_int 0))) > + (const_int 0) > + (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)))) > (set (match_dup 3) > - (ne:QI (reg:CCC FLAGS_REG) (const_int 0))) > + (geu:QI (reg:CCC FLAGS_REG) (const_int 0))) > (set (match_dup 0) (zero_extend:SWI48 (match_dup 3)))] > "operands[3] = gen_reg_rtx (QImode);") > > @@ -19386,10 +19386,10 @@ (define_insn_and_split "*bt<mode>_setncq > "&& 1" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > - (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)) > - (const_int 0))) > + (const_int 0) > + (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)))) > (set (match_dup 0) > - (ne:QI (reg:CCC FLAGS_REG) (const_int 0)))]) > + (geu:QI (reg:CCC FLAGS_REG) (const_int 0)))]) > > ;; Help combine recognize bt followed by setc > (define_insn_and_split "*bt<mode>_setc<mode>_mask" > @@ -19410,10 +19410,10 @@ (define_insn_and_split "*bt<mode>_setc<m > "&& 1" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > - (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)) > - (const_int 0))) > + (const_int 0) > + (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)))) > (set (match_dup 3) > - (eq:QI (reg:CCC FLAGS_REG) (const_int 0))) > + (ltu:QI (reg:CCC FLAGS_REG) (const_int 0))) > (set (match_dup 0) (zero_extend:SWI48 (match_dup 3)))] > { > operands[2] = gen_lowpart (QImode, operands[2]); > --- gcc/testsuite/gcc.c-torture/execute/pr118623.c.jj 2025-02-07 > 18:49:31.228308386 +0100 > +++ gcc/testsuite/gcc.c-torture/execute/pr118623.c 2025-02-07 > 18:45:36.100551949 +0100 > @@ -0,0 +1,23 @@ > +/* PR target/118623 */ > + > +static int > +foo (int x, int y) > +{ > + int a = 1 << x; > + if (y & a) > + return 0; > + return 5; > +} > + > +__attribute__((noipa)) void > +bar (int x) > +{ > + if (((foo (x - 50, x) + x + x) & 1) == 0) > + __builtin_abort (); > +} > + > +int > +main () > +{ > + bar (63); > +} > > Jakub >