On 12/7/25 1:44 PM, Florian Weimer wrote:
* Jeff Law:

This is Shreya's work except for the SH testcase which I added after
realizing her work would also fix the testcases for that port.  I
bootstrapped and regression tested this on sh4-linux-gnu, x86_64 &
risc-v.  It also was tested across all the embedded targets in my tester
without regressions.

--


We are extracting two single-bit bitfields from a structure and
determining whether they both have the value 0 or if at least one bit is
set. This has been generating poor code:

  >         lw      a5,0(a0)
  >         bexti   a0,a5,1
  >         bexti   a5,a5,2
  >         or      a0,a0,a5
  >         ret
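
For reference, a source shape along these lines is the kind of thing that leads to a sequence like the one above. This is only a sketch, not the testcase from the bug report: the names `struct flags`, `pad`, `b`, `c` and `either_set` are made up, and the layout assumes a typical little-endian ABI so that the two fields land on bits 1 and 2.

```c
/* Hypothetical layout: one padding bit, then two single-bit fields.
   On a typical little-endian ABI, b and c land on bits 1 and 2. */
struct flags {
    unsigned pad : 1;
    unsigned b : 1;
    unsigned c : 1;
};

/* Returns 1 if at least one of the two fields is set, 0 otherwise. */
int either_set(const struct flags *s)
{
    return s->b | s->c;
}
```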

We address this as a simplification problem and optimize this using an
andi of the original value and a mask with just the desired bits set,
followed by a snez. This results in a 1 if any of those bits are set or
0 if none.
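
The equivalence can be written out in C as a rough illustration. The function names are hypothetical, and the bit positions 1 and 2 (mask 0x6) are taken from the bexti operands in the sequence above; this models the arithmetic only, not the RTL transformation itself.

```c
#include <stdint.h>

/* What the original sequence computes: pull out bit 1 and bit 2
   separately (the two bexti instructions), then OR them. */
int any_set_extract(uint32_t w)
{
    uint32_t b1 = (w >> 1) & 1u;
    uint32_t b2 = (w >> 2) & 1u;
    return (int)(b1 | b2);
}

/* What the simplified sequence computes: mask both bits at once (andi)
   and test the result against zero (snez). */
int any_set_mask(uint32_t w)
{
    return (w & 0x6u) != 0;
}
```

Both functions return the same value for every input word, which is what makes the rewrite safe.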

For cases where we want to extract three or more single-bit bitfields,
we build on the previous case. We take the result of the 2-bitfield
case, extract the mask, update it to include the new single-bit
bitfield, and again perform an andi + snez.
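
As a sketch of the mask update, assuming a third field at bit 3 (that position and the helper names `widen_mask` and `any_set` are hypothetical, chosen only for illustration):

```c
#include <stdint.h>

/* Fold one more single-bit field into an existing mask. */
uint32_t widen_mask(uint32_t mask, unsigned bit)
{
    return mask | (UINT32_C(1) << bit);
}

/* The andi + snez test: 1 if any bit selected by mask is set in w. */
int any_set(uint32_t w, uint32_t mask)
{
    return (w & mask) != 0;
}
```

With the two-field mask 0x6, `widen_mask(0x6, 3)` gives 0xE, and ORing the old two-field result with bit 3 is the same as a single `any_set(w, 0xE)`.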

In our new testfile, we scan to ensure we do not see a bexti or an or
instruction, and that we have the correct assembly for both two and
three single-bit bitfield cases: lw + andi + snez + ret.

We still have horrible code generation for this on x86-64.

From the bug report:

[ ... ]
I know.  That's part of why the bug is staying open.




Is there no generic infrastructure that could handle this?
That was the hope of doing it in simplify-rtx, since that's a common low-level simplifier module. But it's not necessarily sufficient.

match.pd isn't great for this, as we'd need to see a single load covering the two fields within the structure, and that's not trivially exposed until RTL.

And the actual formation in simplify-rtx can look different on different targets and needs to match something the x86 port defines. It looks like x86 needs a pattern like this:

Failed to match this instruction:
(parallel [
        (set (reg:QI 102 [ _5 ])
            (ne:QI (and:QI (reg:QI 106 [ *s_4(D) ])
                    (const_int 6 [0x6]))
                (const_int 0 [0])))
        (clobber (reg:CC 17 flags))
    ])

Jeff
