On Sat, Feb 8, 2025 at 9:40 AM Jakub Jelinek <ja...@redhat.com> wrote:
>
> Hi!
>
> The following testcase is miscompiled because of RTL represententation
> of bt{l,q} insn followed by e.g. j{c,nc} being misleading to what it
> actually does.
> Let's look e.g. at
> (define_insn_and_split "*jcc_bt<mode>"
>   [(set (pc)
>         (if_then_else (match_operator 0 "bt_comparison_operator"
>                         [(zero_extract:SWI48
>                            (match_operand:SWI48 1 "nonimmediate_operand")
>                            (const_int 1)
>                            (match_operand:QI 2 "nonmemory_operand"))
>                          (const_int 0)])
>                       (label_ref (match_operand 3))
>                       (pc)))
>    (clobber (reg:CC FLAGS_REG))]
>   "(TARGET_USE_BT || optimize_function_for_size_p (cfun))
>    && (CONST_INT_P (operands[2])
>        ? (INTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode)
>           && INTVAL (operands[2])
>                >= (optimize_function_for_size_p (cfun) ? 8 : 32))
>        : !memory_operand (operands[1], <MODE>mode))
>    && ix86_pre_reload_split ()"
>   "#"
>   "&& 1"
>   [(set (reg:CCC FLAGS_REG)
>         (compare:CCC
>           (zero_extract:SWI48
>             (match_dup 1)
>             (const_int 1)
>             (match_dup 2))
>           (const_int 0)))
>    (set (pc)
>         (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)])
>                       (label_ref (match_dup 3))
>                       (pc)))]
> {
>   operands[0] = shallow_copy_rtx (operands[0]);
>   PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0])));
> })
> The define_insn part in RTL describes exactly what it does,
> jumps to op3 if bit op2 in op1 is set (for op0 NE) or not set (for op0 EQ).
> The problem is with what it splits into.
> put_condition_code %C1 for CCCmode comparisons emits c for EQ and LTU,
> nc for NE and GEU and ICEs otherwise.
> CCCmode is used mainly for carry out of add/adc, borrow out of sub/sbb,
> in those cases e.g. for add we have
> (set (reg:CCC flags) (compare:CCC (plus:M x y) x))
> and use (ltu (reg:CCC flags) (const_int 0)) for carry set and
> (geu (reg:CCC flags) (const_int 0)) for carry not set.  These cases
> model in RTL what is actually happening, compare in infinite precision
> x from the result of finite precision addition in M mode and if it is
> less than unsigned (i.e. overflow happened), carry is set.
> Another use of CCCmode is in UNSPEC_* patterns, those are used with
> (eq (reg:CCC flags) (const_int 0)) for carry set and ne for unset,
> given the UNSPEC no big deal, the middle-end doesn't know what means
> set or unset.
> But for the bt{l,q}; j{c,nc} case the above splits it into
> (set (reg:CCC flags) (compare:CCC (zero_extract) (const_int 0)))
> for bt and
> (set (pc) (if_then_else (eq (reg:CCC flags) (const_int 0)) (label_ref) (pc)))
> for the bit set case (so that the jump expands to jc) and ne for
> the bit not set case (so that the jump expands to jnc).
> Similarly for the different splitters for cmov and set{c,nc} etc.
> The problem is that when the middle-end reads this RTL, it feels
> the exact opposite to it.  If zero_extract is 1, flags is set
> to comparison of 1 and 0 and that would mean using ne ne in the
> if_then_else, and vice versa.
>
> So, in order to better describe in RTL what is actually happening,
> one possibility would be to swap the behavior of put_condition_code
> and use NE + LTU -> c and EQ + GEU -> nc rather than the current
> EQ + LTU -> c and NE + GEU -> nc; and adjust everything.  The
> following patch uses a more limited approach, instead of representing
> bt{l,q}; j{c,nc} case as written above it uses
> (set (reg:CCC flags) (compare:CCC (const_int 0) (zero_extract)))
> and
> (set (pc) (if_then_else (ltu (reg:CCC flags) (const_int 0)) (label_ref) (pc)))
> which uses the existing put_condition_code but describes what the
> insns actually do in RTL clearly.  If zero_extract is 1,
> then flags are LTU, 0U < 1U, if zero_extract is 0, then flags are GEU,
> 0U >= 0U.  The patch adjusts the *bt<mode> define_insn and all the
> splitters to it and its comparisons/conditional moves/setXX.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2025-02-08  Jakub Jelinek  <ja...@redhat.com>
>
>         PR target/118623
>         * config/i386/i386.md (*bt<mode>): Represent bt as
>         compare:CCC of const0_rtx and zero_extract rather than
>         zero_extract and const0_rtx.
>         (*bt<SWI48:mode>_mask): Likewise.
>         (*jcc_bt<mode>): Likewise.  Use LTU and GEU as flags test
>         instead of EQ and NE.
>         (*jcc_bt<mode>_mask): Likewise.
>         (*jcc_bt<SWI48:mode>_mask_1): Likewise.
>         (Help combine recognize bt followed by cmov splitter): Likewise.
>         (*bt<mode>_setcqi): Likewise.
>         (*bt<mode>_setncqi): Likewise.
>         (*bt<mode>_setnc<mode>): Likewise.
>         (*bt<mode>_setncqi_2): Likewise.
>         (*bt<mode>_setc<mode>_mask): Likewise.
>
>         * gcc.c-torture/execute/pr118623.c: New test.

OK, but please note that generated comparison is not in its canonical form:

"For commutative and comparison operators, a constant is always made
the second operand."

AFAICS, this should not be a problem, but perhaps some optimizations
can be suppressed by this.

Uros.

>
> --- gcc/config/i386/i386.md.jj  2025-02-07 22:11:49.444941171 +0100
> +++ gcc/config/i386/i386.md     2025-02-07 22:32:19.640943663 +0100
> @@ -19124,11 +19124,11 @@ (define_peephole2
>  (define_insn "*bt<mode>"
>    [(set (reg:CCC FLAGS_REG)
>         (compare:CCC
> +         (const_int 0)
>           (zero_extract:SWI48
>             (match_operand:SWI48 0 "nonimmediate_operand" "r,m")
>             (const_int 1)
> -           (match_operand:QI 1 "nonmemory_operand" "q<S>,<S>"))
> -         (const_int 0)))]
> +           (match_operand:QI 1 "nonmemory_operand" "q<S>,<S>"))))]
>    ""
>  {
>    switch (get_attr_mode (insn))
> @@ -19155,14 +19155,14 @@ (define_insn "*bt<mode>"
>  (define_insn_and_split "*bt<SWI48:mode>_mask"
>    [(set (reg:CCC FLAGS_REG)
>          (compare:CCC
> +         (const_int 0)
>            (zero_extract:SWI48
>              (match_operand:SWI48 0 "nonimmediate_operand" "r,m")
>              (const_int 1)
>             (subreg:QI
>               (and:SWI248
>                 (match_operand:SWI248 1 "register_operand")
> -               (match_operand 2 "const_int_operand")) 0))
> -          (const_int 0)))]
> +               (match_operand 2 "const_int_operand")) 0))))]
>    "TARGET_USE_BT
>     && (INTVAL (operands[2]) & (GET_MODE_BITSIZE (<SWI48:MODE>mode)-1))
>        == GET_MODE_BITSIZE (<SWI48:MODE>mode)-1
> @@ -19171,8 +19171,8 @@ (define_insn_and_split "*bt<SWI48:mode>_
>    "&& 1"
>    [(set (reg:CCC FLAGS_REG)
>          (compare:CCC
> -         (zero_extract:SWI48 (match_dup 0) (const_int 1) (match_dup 1))
> -         (const_int 0)))]
> +        (const_int 0)
> +        (zero_extract:SWI48 (match_dup 0) (const_int 1) (match_dup 1))))]
>    "operands[1] = gen_lowpart (QImode, operands[1]);")
>
>  (define_insn_and_split "*jcc_bt<mode>"
> @@ -19197,18 +19197,18 @@ (define_insn_and_split "*jcc_bt<mode>"
>    "&& 1"
>    [(set (reg:CCC FLAGS_REG)
>         (compare:CCC
> +         (const_int 0)
>           (zero_extract:SWI48
>             (match_dup 1)
>             (const_int 1)
> -           (match_dup 2))
> -         (const_int 0)))
> +           (match_dup 2))))
>     (set (pc)
>         (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)])
>                       (label_ref (match_dup 3))
>                       (pc)))]
>  {
>    operands[0] = shallow_copy_rtx (operands[0]);
> -  PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0])));
> +  PUT_CODE (operands[0], GET_CODE (operands[0]) == NE ? LTU : GEU);
>  })
>
>  ;; Avoid useless masking of bit offset operand.
> @@ -19233,18 +19233,18 @@ (define_insn_and_split "*jcc_bt<mode>_ma
>    "&& 1"
>    [(set (reg:CCC FLAGS_REG)
>         (compare:CCC
> +         (const_int 0)
>           (zero_extract:SWI48
>             (match_dup 1)
>             (const_int 1)
> -           (match_dup 2))
> -         (const_int 0)))
> +           (match_dup 2))))
>     (set (pc)
>         (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)])
>                       (label_ref (match_dup 4))
>                       (pc)))]
>  {
>    operands[0] = shallow_copy_rtx (operands[0]);
> -  PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0])));
> +  PUT_CODE (operands[0], GET_CODE (operands[0]) == NE ? LTU : GEU);
>  })
>
>  ;; Avoid useless masking of bit offset operand.
> @@ -19270,18 +19270,18 @@ (define_insn_and_split "*jcc_bt<SWI48:mo
>    "&& 1"
>    [(set (reg:CCC FLAGS_REG)
>         (compare:CCC
> +         (const_int 0)
>           (zero_extract:SWI48
>             (match_dup 1)
>             (const_int 1)
> -           (match_dup 2))
> -         (const_int 0)))
> +           (match_dup 2))))
>     (set (pc)
>         (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)])
>                       (label_ref (match_dup 4))
>                       (pc)))]
>  {
>    operands[0] = shallow_copy_rtx (operands[0]);
> -  PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0])));
> +  PUT_CODE (operands[0], GET_CODE (operands[0]) == NE ? LTU : GEU);
>    operands[2] = gen_lowpart (QImode, operands[2]);
>  })
>
> @@ -19302,10 +19302,10 @@ (define_split
>     && ix86_pre_reload_split ()"
>    [(set (reg:CCC FLAGS_REG)
>         (compare:CCC
> -        (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))
> -        (const_int 0)))
> +        (const_int 0)
> +        (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))))
>     (set (match_dup 0)
> -       (if_then_else:SWI248 (eq (reg:CCC FLAGS_REG) (const_int 0))
> +       (if_then_else:SWI248 (ltu (reg:CCC FLAGS_REG) (const_int 0))
>                              (match_dup 3)
>                              (match_dup 4)))]
>  {
> @@ -19326,10 +19326,10 @@ (define_insn_and_split "*bt<mode>_setcqi
>    "&& 1"
>    [(set (reg:CCC FLAGS_REG)
>          (compare:CCC
> -         (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))
> -         (const_int 0)))
> +        (const_int 0)
> +        (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))))
>     (set (match_dup 0)
> -        (eq:QI (reg:CCC FLAGS_REG) (const_int 0)))])
> +       (ltu:QI (reg:CCC FLAGS_REG) (const_int 0)))])
>
>  ;; Help combine recognize bt followed by setnc
>  (define_insn_and_split "*bt<mode>_setncqi"
> @@ -19346,10 +19346,10 @@ (define_insn_and_split "*bt<mode>_setncq
>    "&& 1"
>    [(set (reg:CCC FLAGS_REG)
>          (compare:CCC
> -         (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))
> -         (const_int 0)))
> +        (const_int 0)
> +        (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))))
>     (set (match_dup 0)
> -        (ne:QI (reg:CCC FLAGS_REG) (const_int 0)))])
> +       (geu:QI (reg:CCC FLAGS_REG) (const_int 0)))])
>
>  (define_insn_and_split "*bt<mode>_setnc<mode>"
>    [(set (match_operand:SWI48 0 "register_operand")
> @@ -19364,10 +19364,10 @@ (define_insn_and_split "*bt<mode>_setnc<
>    "&& 1"
>    [(set (reg:CCC FLAGS_REG)
>          (compare:CCC
> -         (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))
> -         (const_int 0)))
> +        (const_int 0)
> +        (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))))
>     (set (match_dup 3)
> -        (ne:QI (reg:CCC FLAGS_REG) (const_int 0)))
> +       (geu:QI (reg:CCC FLAGS_REG) (const_int 0)))
>     (set (match_dup 0) (zero_extend:SWI48 (match_dup 3)))]
>    "operands[3] = gen_reg_rtx (QImode);")
>
> @@ -19386,10 +19386,10 @@ (define_insn_and_split "*bt<mode>_setncq
>    "&& 1"
>    [(set (reg:CCC FLAGS_REG)
>          (compare:CCC
> -         (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))
> -         (const_int 0)))
> +         (const_int 0)
> +         (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))))
>     (set (match_dup 0)
> -        (ne:QI (reg:CCC FLAGS_REG) (const_int 0)))])
> +       (geu:QI (reg:CCC FLAGS_REG) (const_int 0)))])
>
>  ;; Help combine recognize bt followed by setc
>  (define_insn_and_split "*bt<mode>_setc<mode>_mask"
> @@ -19410,10 +19410,10 @@ (define_insn_and_split "*bt<mode>_setc<m
>    "&& 1"
>    [(set (reg:CCC FLAGS_REG)
>          (compare:CCC
> -         (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))
> -         (const_int 0)))
> +         (const_int 0)
> +         (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))))
>     (set (match_dup 3)
> -        (eq:QI (reg:CCC FLAGS_REG) (const_int 0)))
> +       (ltu:QI (reg:CCC FLAGS_REG) (const_int 0)))
>     (set (match_dup 0) (zero_extend:SWI48 (match_dup 3)))]
>  {
>    operands[2] = gen_lowpart (QImode, operands[2]);
> --- gcc/testsuite/gcc.c-torture/execute/pr118623.c.jj   2025-02-07 
> 18:49:31.228308386 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr118623.c      2025-02-07 
> 18:45:36.100551949 +0100
> @@ -0,0 +1,23 @@
> +/* PR target/118623 */
> +
> +static int
> +foo (int x, int y)
> +{
> +  int a = 1 << x;
> +  if (y & a)
> +    return 0;
> +  return 5;
> +}
> +
> +__attribute__((noipa)) void
> +bar (int x)
> +{
> +  if (((foo (x - 50, x) + x + x) & 1) == 0)
> +    __builtin_abort ();
> +}
> +
> +int
> +main ()
> +{
> +  bar (63);
> +}
>
>         Jakub
>

Reply via email to