On Mon, Dec 8, 2025 at 1:34 AM Andrew Pinski <[email protected]> wrote:
>
>
>
> On Sun, Dec 7, 2025, 12:45 PM Florian Weimer <[email protected]> wrote:
>>
>> * Jeff Law:
>>
>> > This is Shreya's work except for the SH testcase which I added after
>> > realizing her work would also fix the testcases for that port.  I
>> > bootstrapped and regression tested this on sh4-linux-gnu, x86_64 &
>> > risc-v.  It also was tested across all the embedded targets in my tester
>> > without regressions.
>> >
>> > --
>> >
>> >
>> > We are extracting two single-bit bitfields from a structure and
>> > determining whether they both have the value 0 or if at least one bit is
>> > set. This has been generating poor code:
>> >
>> >  >         lw      a5,0(a0)
>> >  >         bexti   a0,a5,1
>> >  >         bexti   a5,a5,2
>> >  >         or      a0,a0,a5
>> >  >         ret
>> >
>> > We address this as a simplification problem and optimize this using an
>> > andi of the original value and a mask with just the desired bits set,
>> > followed by a snez. This results in a 1 if any of those bits are set or
>> > 0 if none.
>> >
>> > For cases where we want to extract three or more single-bit bitfields,
>> > we build on the previous case. We take the result of the 2-bitfield
>> > case, extract the mask, update it to include the new single-bit
>> > bitfield, and again perform an andi + snez.
>> >
>> > In our new testfile, we scan to ensure we do not see a bexti or an or
>> > instruction, and that we have the correct assembly for both two and
>> > three single-bit bitfield cases: lw + andi + snez + ret.
>>
>> We still have horrible code generation for this on x86-64.
>>
>> From the bug report:
>>
>> typedef struct
>> {
>>   _Bool a : 1;
>>   _Bool b : 1;
>>   _Bool c : 1;
>>   _Bool d : 1;
>>   unsigned int e : 4;
>> } S;
>>
>>
>> _Bool test_00 (S* s)
>> {
>>   return s->b | s->c;
>> }
>>
>> We get this:
>>
>> test_00:
>>         movzbl  (%rdi), %edx
>>         movl    %edx, %ecx
>>         movl    %edx, %eax
>>         shrb    %cl
>>         shrb    $2, %al
>>         orl     %ecx, %eax
>>         andl    $1, %eax
>>         ret
>>
>> This should be something like this instead:
>>
>> test_00:
>>         xorl    %eax, %eax
>>         testb   $6, (%rdi)
>>         setne   %al
>>         retq
>>
>> Is there no generic infrastructure that could handle this?
>
>
>
> The biggest way of fixing/improving the situation here would be to have a
> lowering pass for bitfields sometime later, say before PRE. I had worked on
> one before, but I never got back to working through the comments on the pass.
> Maybe next year I can find some time to mentor someone to implement it and
> finish up the pass.

ifcombine now handles most of these cases (the handling was moved there
from fold), but still not all of them.
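
For reference, here is a minimal C sketch of the equivalence being
discussed: the bitfield form from the bug report next to a hand-written
mask-and-test form. The names bc_mask and test_00_mask are hypothetical
and only for illustration. The sketch assumes that all of the bitfields
of S live in the first byte of the struct (true for GCC on x86-64, where
the derived mask is 6, matching the testb $6 above). Bitfield layout is
implementation-defined, so the mask is derived at run time rather than
hard-coded.

```c
#include <stdbool.h>
#include <string.h>

typedef struct
{
  _Bool a : 1;
  _Bool b : 1;
  _Bool c : 1;
  _Bool d : 1;
  unsigned int e : 4;
} S;

/* Bitfield form, as in the bug report.  */
_Bool test_00 (S *s)
{
  return s->b | s->c;
}

/* Hypothetical helper: build the byte mask that covers b and c by
   setting just those two fields in a zeroed struct.  Assumes both
   fields are stored in the first byte of S.  */
static unsigned char bc_mask (void)
{
  S s;
  unsigned char byte;

  memset (&s, 0, sizeof s);
  s.b = 1;
  s.c = 1;
  memcpy (&byte, &s, 1);
  return byte;
}

/* Mask form: one load, one AND against the combined mask, one
   compare against zero.  This is the shape of the desired
   andi + snez (RISC-V) or testb + setne (x86-64) sequence.  */
_Bool test_00_mask (const S *s)
{
  unsigned char byte;

  memcpy (&byte, s, 1);
  return (byte & bc_mask ()) != 0;
}
```

Both functions should agree for every value of the first byte; the
open question in this thread is only which pass gets the compiler to
produce the second form from the first.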

Richard.

> Thanks,
> Andrew
