On Tue, Aug 5, 2025 at 1:32 PM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Richard Sandiford <richard.sandif...@arm.com> writes:
> > "H.J. Lu" <hjl.to...@gmail.com> writes:
> >> On Mon, Aug 4, 2025 at 3:28 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> >>>
> >>> On Mon, Aug 4, 2025 at 2:04 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> >>> >
> >>> > On Mon, Aug 4, 2025 at 8:50 AM Richard Sandiford
> >>> > <richard.sandif...@arm.com> wrote:
> >>> > > Sorry, I hadn't realised that there were still unfixed regressions
> >>> > > from that patch.  I suppose if we wanted to avoid two patterns here,
> >>> > > we'd need to extend the pre-existing word_mode folds to support
> >>> > > subword modes too (for !WORD_REGISTER_OPERATIONS).  The attached
> >>> > > untested patch does that, but I expect it would have similar
> >>> > > knock-on effects.  I'll give it a spin overnight on x86 anyway
> >>> > > just to see what happens.
> >>> >
> >>> > Yes, it fixes:
> >>> >
> >>> > FAIL: gcc.target/i386/pr82524.c scan-assembler-not mov[sz]bl
> >>> > FAIL: gcc.target/i386/pr82524.c scan-assembler [ \t]notb
> >>> >
> >>> > together with the enclosed patch.
> >>>
> >>> 64-bit libgo failed to compile
> >>>
> >>> during RTL pass: late_combine
> >>> /export/gnu/import/git/gitlab/x86-gcc/libgo/go/runtime/mpallocbits.go:
> >>> In function ‘runtime.pageBits.clear’:
> >>> /export/gnu/import/git/gitlab/x86-gcc/libgo/go/runtime/mpallocbits.go:67:1:
> >>> internal compiler error: in simplify_subreg, at simplify-rtx.cc:8085
> >>>    67 | func (b *pageBits) clear(i uint) {
> >>>       | ^
> >>> 0x2276abf internal_error(char const*, ...)
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/diagnostic-global-context.cc:534
> >>> 0x669387 fancy_abort(char const*, int, char const*)
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/diagnostics/context.cc:1640
> >>> 0x4d0a70 simplify_context::simplify_subreg(machine_mode, rtx_def*,
> >>> machine_mode, poly_int<1u, unsigned long>)
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/simplify-rtx.cc:8085
> >>> 0xdbc955 simplify_context::simplify_subreg(machine_mode, rtx_def*,
> >>> machine_mode, poly_int<1u, unsigned long>)
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/simplify-rtx.cc:8349
> >>> 0xd38441 insn_propagation::apply_to_rvalue_1(rtx_def**)
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/recog.cc:1224
> >>> 0xd37e8b insn_propagation::apply_to_rvalue_1(rtx_def**)
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/recog.cc:1166
> >>> 0xd38942 insn_propagation::apply_to_pattern_1(rtx_def**)
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/recog.cc:1400
> >>> 0xd389ef insn_propagation::apply_to_pattern(rtx_def**)
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/recog.cc:1444
> >>> 0x20c441d substitute_nondebug_use
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/late-combine.cc:198
> >>> 0x20c441d substitute_nondebug_uses
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/late-combine.cc:271
> >>> 0x20c5b5d run
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/late-combine.cc:440
> >>> 0x20c5b5d combine_into_uses
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/late-combine.cc:690
> >>> 0x20c669c execute
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/late-combine.cc:718
> >>> 0x20c669c execute
> >>> /export/gnu/import/git/gitlab/x86-gcc/gcc/late-combine.cc:771
> >>> Please submit a full bug report, with preprocessed source (by using
> >>> -freport-bug).
> >>> Please include the complete backtrace with any bug report.
> >>> See <https://gcc.gnu.org/bugs/> for instructions.
> >>>
> >>> (gdb) call debug (op)
> >>> (and:SI (reg:DI 113 [ i ])
> >>>     (const_int 63 [0x3f]))
> >>> (gdb)
> >
> > That's not a valid rtx though, so...
> >
> >> The enclosed patch works.
> >
> > ...I think this is masking a bug elsewhere.
> >
> > Specifically:
> >
> >> +  /* Attempt to simplify WORD_MODE and sub-WORD_MODE SUBREGs of bitwise
> >> +     expressions.  */
> >> +  scalar_int_mode int_outermode;
> >> +  if (is_a<scalar_int_mode> (outermode, &int_outermode)
> >> +      && (WORD_REGISTER_OPERATIONS
> >> +      ? int_outermode == word_mode
> >> +      : GET_MODE_PRECISION (int_outermode) <= BITS_PER_WORD)
> >> +      && SCALAR_INT_MODE_P (innermode)
> >> +      && (GET_CODE (op) == IOR
> >> +      || GET_CODE (op) == XOR
> >> +      || GET_CODE (op) == AND
> >> +      || GET_CODE (op) == NOT))
> >>      {
> >> -      rtx op0 = simplify_subreg (outermode, XEXP (op, 0), innermode, byte);
> >> -      if (op0)
> >> -    return simplify_gen_unary (GET_CODE (op), outermode, op0, outermode);
> >> +      rtx op0 = XEXP (op, 0);
> >> +      if (GET_MODE (op0) != innermode)
> >
> > This condition must be true if op0 isn't a constant, for the op codes
> > tested above.
> >
> > I'll try to reproduce.
>
> It's coming from:
>
> (define_split
>   [(set (match_operand:SWI 0 "register_operand")
>         (any_rotate:SWI
>           (match_operand:SWI 1 "const_int_operand")
>           (subreg:QI
>             (and
>               (match_operand 2 "int248_register_operand")
>               (match_operand 3 "const_int_operand")) 0)))]
>  "(INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode) - 1))
>    == GET_MODE_BITSIZE (<MODE>mode) - 1"
>  [(set (match_dup 4) (match_dup 1))
>   (set (match_dup 0)
>        (any_rotate:SWI (match_dup 4)
>                        (subreg:QI
>                          (and:SI (match_dup 2) (match_dup 3)) 0)))]
>  "operands[4] = gen_reg_rtx (<MODE>mode);")
>
> which matches any mode of (and ...) on input, but hard-codes (and:SI ...)
> in the output.  This causes an ICE if the incoming (and ...) is DImode
> rather than SImode.
>
> The patch below seems to fix it.

I am testing a slightly adjusted patch:

--cut here--
i386: Fix invalid RTX mode in the unnamed rotate splitter.

The following splitter from commit r11-5747:

(define_split
  [(set (match_operand:SWI 0 "register_operand")
        (any_rotate:SWI
          (match_operand:SWI 1 "const_int_operand")
          (subreg:QI
            (and
              (match_operand 2 "int248_register_operand")
              (match_operand 3 "const_int_operand")) 0)))]
 "(INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode) - 1))
   == GET_MODE_BITSIZE (<MODE>mode) - 1"
 [(set (match_dup 4) (match_dup 1))
  (set (match_dup 0)
       (any_rotate:SWI (match_dup 4)
                       (subreg:QI
                         (and:SI (match_dup 2) (match_dup 3)) 0)))]
 "operands[4] = gen_reg_rtx (<MODE>mode);")

matches any mode of (and ...) on input, but hard-codes (and:SI ...)
in the output.  This causes an ICE if the incoming (and ...) is DImode
rather than SImode.

Co-developed-by: Richard Sandiford <richard.sandif...@arm.com>

    PR target/96226

gcc/ChangeLog:

    * config/i386/predicates.md (and_operator): New operator.
    * config/i386/i386.md (splitter after *<rotate_insn><mode>3_mask):
    Use and_operator to match AND RTX and use its mode
    in the split pattern.
--cut here--

that fixes the mentioned issue from PR96226.
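
To illustrate why hard-coding the mode is a problem, here is a toy Python
model of the mismatch. It is only a sketch, not GCC code: the Rtx class,
is_valid, split_hardcoded and split_reusing_operator are all hypothetical
names, and the validity rule is a simplification (a logic operation must
have the same mode as its non-constant operands).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rtx:
    """Hypothetical stand-in for an RTL expression (not GCC's rtx_def)."""
    code: str                     # e.g. "reg", "const_int", "and"
    mode: Optional[str]           # e.g. "SI", "DI"; None models a modeless constant
    operands: Tuple["Rtx", ...] = ()


def is_valid(x: Rtx) -> bool:
    """Simplified rule: a logic op and its non-constant operands share one mode."""
    if x.code in ("and", "ior", "xor"):
        modes_ok = all(op.mode is None or op.mode == x.mode for op in x.operands)
        return modes_ok and all(is_valid(op) for op in x.operands)
    return True


def split_hardcoded(matched_and: Rtx) -> Rtx:
    """Models the old splitter output: always rebuilds the AND in SImode."""
    return Rtx("and", "SI", matched_and.operands)


def split_reusing_operator(matched_and: Rtx) -> Rtx:
    """Models the match_op_dup idea: keep the matched operator's code and mode."""
    return Rtx(matched_and.code, matched_and.mode, matched_and.operands)


if __name__ == "__main__":
    mask = Rtx("const_int", None)
    and_di = Rtx("and", "DI", (Rtx("reg", "DI"), mask))
    # The hard-coded variant wraps a DImode register in an SImode AND.
    print(is_valid(split_hardcoded(and_di)))
    # Reusing the matched operator keeps the original DImode.
    print(is_valid(split_reusing_operator(and_di)))
```

In this model the hard-coded variant produces the analogue of
(and:SI (reg:DI ...) (const_int ...)) for a DImode input, while reusing
the matched operator stays consistent for any input mode.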

I plan to commit this patch in a couple of hours, which IMO clears the
way for Richard's simplify-rtx.cc patch to land.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 2b0dd66c68b..6686f1070f9 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -18298,17 +18298,17 @@ (define_split
        (any_rotate:SWI
          (match_operand:SWI 1 "const_int_operand")
          (subreg:QI
-           (and
-             (match_operand 2 "int248_register_operand")
-             (match_operand 3 "const_int_operand")) 0)))]
+           (match_operator 4 "and_operator"
+             [(match_operand 2 "int248_register_operand")
+              (match_operand 3 "const_int_operand")]) 0)))]
  "(INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode) - 1))
    == GET_MODE_BITSIZE (<MODE>mode) - 1"
- [(set (match_dup 4) (match_dup 1))
+ [(set (match_dup 5) (match_dup 1))
   (set (match_dup 0)
-       (any_rotate:SWI (match_dup 4)
+       (any_rotate:SWI (match_dup 5)
                       (subreg:QI
-                        (and:SI (match_dup 2) (match_dup 3)) 0)))]
- "operands[4] = gen_reg_rtx (<MODE>mode);")
+                        (match_op_dup 4 [(match_dup 2) (match_dup 3)]) 0)))]
+ "operands[5] = gen_reg_rtx (<MODE>mode);")
 
 (define_insn_and_split "*<insn><mode>3_mask_1"
   [(set (match_operand:SWI 0 "nonimmediate_operand")
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 0f310902e7b..175798cff69 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1714,10 +1714,14 @@ (define_predicate "mult_operator"
 (define_predicate "div_operator"
   (match_code "div"))
 
-;; Return true if this is a and, ior or xor operation.
+;; Return true if this is an and, ior or xor operation.
 (define_predicate "logic_operator"
   (match_code "and,ior,xor"))
 
+;; Return true if this is an and operation.
+(define_predicate "and_operator"
+  (match_code "and"))
+
 ;; Return true if this is a plus, minus, and, ior or xor operation.
 (define_predicate "plusminuslogic_operator"
   (match_code "plus,minus,and,ior,xor"))
