https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83784

            Bug ID: 83784
           Summary: Missed optimization with bitfield
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: daniel.santos at pobox dot com
  Target Milestone: ---

Created attachment 43095
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43095&action=edit
test case

The layout of bitfields in memory is, of course, not specified by the C
standard; it is implementation-defined.  But when I happen to guess correctly
how gcc will lay them out, I would like these pack and unpack functions to
compile out.  I'm only doing this because I need to know which 32-bit half of
a 64-bit value holds one of the fields (for futex operations), and bitfields
are syntactically easier to work with.  Because of this missed optimization, I
have to go back to shifting, ANDing, ORing, etc.
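
For reference, the shift-and-mask fallback mentioned above could look roughly
like the sketch below.  This is only an illustration: the helper names
(field_mask, field_get, field_set, field_half) and the A_*/B_* constants are
hypothetical and are not taken from the attached test case, and the layout is
an assumed one with a 4-bit field in the low bits and a 60-bit field above it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only: a 4-bit field in bits 0..3
 * and a 60-bit field in bits 4..63 of a single 64-bit word. */
#define A_SHIFT 0
#define A_BITS  4
#define B_SHIFT 4
#define B_BITS  60

/* Mask with the low `bits` bits set; valid for 1 <= bits <= 64. */
static inline uint64_t field_mask(unsigned bits)
{
    return bits >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/* Extract a field by shifting it down and masking off everything above it. */
static inline uint64_t field_get(uint64_t word, unsigned shift, unsigned bits)
{
    return (word >> shift) & field_mask(bits);
}

/* Return `word` with the field replaced by `value` (truncated to fit). */
static inline uint64_t field_set(uint64_t word, unsigned shift, unsigned bits,
                                 uint64_t value)
{
    uint64_t mask = field_mask(bits) << shift;
    return (word & ~mask) | ((value << shift) & mask);
}

/* Which 32-bit half holds the field's lowest bit: 0 = low half, 1 = high
 * half, counted by bit significance rather than by address. */
static inline unsigned field_half(unsigned shift)
{
    return shift / 32;
}
```

On little-endian x86 the low half (index 0) sits at the lower address, which
is what matters when picking the 32-bit word to hand to a futex call.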

The attached test case is probably not as simple as it could be, as I'm testing
both 32- and 64-bit code on x86, but the below is probably a decent summary
(for 64 bits):

union u
{
    unsigned long ulong_val;
    struct {
        unsigned long a:4;
        unsigned long b:60;
    };
};

union u pack(union u in)
{
    union u ret;
    ret.ulong_val   = in.b;
    ret.ulong_val <<= 4;
    ret.ulong_val  |= in.a;
    return ret;
}

The above pack function compiles into the no-op I would expect:
pack:
.LFB12:
        .cfi_startproc
        movq    %rdi, %rax
        ret
        .cfi_endproc


But if I use three bitfields, my pack function is no longer a no-op:

union u
{
    unsigned long ulong_val;
    struct {
        unsigned long a:4;
        unsigned long b:30;
        unsigned long c:30;
    };
};

union u pack( union u in )
{
    union u ret;
    ret.ulong_val   = in.c;
    ret.ulong_val <<= 30;
    ret.ulong_val  |= in.b;
    ret.ulong_val <<= 4;
    ret.ulong_val  |= in.a;
    return ret;
}

And here's the output (with hex immediates for ANDs):
pack:
.LFB11:
        .cfi_startproc
        movq    %rdi, %rax
        movq    %rdi, %rdx
        andl    $0xf, %edi
        shrq    $34, %rax
        shrq    $4, %rdx
        salq    $30, %rax
        andl    $0x3fffffff, %edx
        orq     %rdx, %rax
        salq    $4, %rax
        orq     %rdi, %rax
        ret
        .cfi_endproc
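
To illustrate why a no-op is the expected result here as well, the sketch
below models the same sequence of operations with plain shifts and masks.  It
assumes the layout gcc appears to use on x86-64 (a in bits 0..3, b in bits
4..33, c in bits 34..63); the names get_a, get_b, get_c and pack_model are
made up for this illustration and do not appear in the attached test case.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (an assumption about gcc on x86-64, not guaranteed by C):
 *   a = bits 0..3, b = bits 4..33, c = bits 34..63. */
static uint64_t get_a(uint64_t v) { return v & UINT64_C(0xf); }
static uint64_t get_b(uint64_t v) { return (v >> 4) & UINT64_C(0x3fffffff); }
static uint64_t get_c(uint64_t v) { return v >> 34; }

/* Same sequence of steps as the three-field pack(), written without
 * bitfields so the result does not depend on the compiler's layout. */
static uint64_t pack_model(uint64_t in)
{
    uint64_t ret;
    ret   = get_c(in);   /* c lands in bits 0..29            */
    ret <<= 30;          /* c moves to bits 30..59           */
    ret  |= get_b(in);   /* b fills bits 0..29               */
    ret <<= 4;           /* c is now at 34..63, b at 4..33   */
    ret  |= get_a(in);   /* a fills bits 0..3                */
    return ret;
}
```

Under that layout every field ends up back at the bit position it was read
from, so pack_model(x) == x for any x, and ideally the bitfield version would
reduce to a single move just like the two-field case.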


Possibly related to bug #15596 and maybe even a duplicate of bug #35363, but
I'm uncertain.  I have only tested gcc 5.4.0 and gcc 8 from git so far, and
only on x86, but I'm going to *guess* this is a tree-optimization issue and
not the x86 backend.
