On 10/6/25 6:39 AM, Kishan Parmar wrote:

On 04/10/25 10:59 pm, Jeff Law wrote:

On 9/15/25 9:40 AM, Kishan Parmar wrote:
Hi All,

Changes from v1:
     * Corrected commit message.

For a given rtx expression (and (lshiftrt (subreg X) shift) mask)
combine pass tries to simplify the RTL form to

     (and (subreg (lshiftrt X shift)) mask)

where the SUBREG wraps the result of the shift.  This leaves the AND
and the shift in different modes, which complicates recognition.

     (and (lshiftrt (subreg X) shift) mask)

where the SUBREG is inside the shift and both operations share the same
mode.  This form is easier to recognize across targets and enables
cleaner pattern matching.

This patch checks in simplify-rtx to perform this transformation when
safe: the SUBREG must be a lowpart, the shift amount must be valid, and
the precision of the operation must be preserved.

Tested on powerpc64le-linux-gnu, powerpc64-linux-gnu, and
x86_64-pc-linux-gnu with no regressions.  On rs6000, the change reduces
insn counts due to improved matching.

2025-09-15  Kishan Parmar  <[email protected]>

gcc/ChangeLog:

     PR rtl-optimization/93738
     * simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
     Canonicalize SUBREG(LSHIFTRT) into LSHIFTRT(SUBREG) when valid.
This doesn't look correct to me.  I just don't see how we can safely push the 
subreg to the inner ops like you're trying to do in a generally safe manner.

Assume we have a narrowing lowpart subreg.  Say QImode.

And let's take the original RTL:

(and (subreg (lshift X shift)) mask)

In this case the subreg select the low 8 bits from the output of the lshift.  
Let's assume X is HImode and our shift count is 4.  Given an input value like

aaaa bbbb cccc dddd

We shift off the low 4 bits resulting in

0000 aaaa bbbb cccc

Then we apply the narrowing QI subreg resulting in

bbbb cccc


Let's do the same with your updated RTL

(and (lshift (subreg X) shift) mask)

aaaa bbbb cccc dddd

We apply the narrowing subreg resulting in

XXXX XXXX cccc dddd

Then shift 4 bits resulting in

0000 XXXX XXXX cccc

Note how we've lots the bits "bbbb" and replaced them with indeterminate values.

This can only be safe if the mask was going to wipe those 4 bit positions

Am I missing something here?

jeff
Hello Jeff,

For scenarios like above below condition fails and do not transform the rtx.

+             if (GET_CODE (op0) == SUBREG && subreg_lowpart_p (op0)
+                 && GET_CODE (XEXP (op0 ,0)) == LSHIFTRT
+                 && CONST_INT_P (XEXP (XEXP (op0 ,0), 1))
+                 && INTVAL (XEXP (XEXP (op0 ,0), 1)) >= 0
+                 && INTVAL (XEXP (XEXP (op0 ,0), 1)) < HOST_BITS_PER_WIDE_INT
+                 && ((INTVAL (XEXP (XEXP (op0, 0), 1))
+                     + floor_log2 (val1))
+                     < GET_MODE_PRECISION (as_a <scalar_int_mode> (mode))))
+               {

The transformation is done only when the AND mask is a compile-time constant
and "(msb(mask) + shift) < subreg mode precision", ensuring no loss of relevant 
bits.

For above case we have (8(mask) + 4(shift) < 8 (QI)), Reason being we don't 
apply transform here.
Clearly I misread this clause (repeatedly I might add).

+                 && GET_CODE (XEXP (op0 ,0)) == LSHIFTRT
Formatting it. It should be "XEXP (op0, 0)", the whitespace goes after the comma, not before. This is repeated in several of your expressions in that conditional.

+                 if (GET_CODE (tem) == SUBREG)
+                   {
+                     if (subreg_lowpart_p (tem))
+                       tem = SUBREG_REG (tem);
+                     else
+                       goto no_xform;
+                   }
Does this happen often enough to matter? Is there a way to restructure to avoid the goto? If a goto is really the cleanest form I would go with a generic name rather than something referring to a ppc concept (xform). Perhaps set tem to NULL_RTX is that else clause, then guard the code following the cause above with an outer if (tem)?

Jeff







Reply via email to