https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83784
            Bug ID: 83784
           Summary: Missed optimization with bitfield
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: daniel.santos at pobox dot com
  Target Milestone: ---

Created attachment 43095
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43095&action=edit
test case

The layout of bitfields in memory is, of course, not specified by the C
standard and is implementation-dependent. But when I correctly guess how gcc
will lay them out, I would like these pack and unpack functions to compile
out. I'm only doing this because I need to know which 32-bit portion of a
64-bit value holds one of the fields (for futex operations), and bitfields are
syntactically easier to work with. But due to this flaw, I have to go back to
shifting, ANDing, ORing, etc.

The attached test case is probably not as simple as it could be, as I'm
testing both 32-bit and 64-bit code on x86, but the following is probably a
decent summary (for 64 bits):

    union u {
        unsigned long ulong_val;
        struct {
            unsigned long a:4;
            unsigned long b:60;
        };
    };

    union u pack(union u in)
    {
        union u ret;
        ret.ulong_val = in.b;
        ret.ulong_val <<= 4;
        ret.ulong_val |= in.a;
        return ret;
    }

The above pack function compiles into the no-op I would expect:

    pack:
    .LFB12:
            .cfi_startproc
            movq    %rdi, %rax
            ret
            .cfi_endproc

But if I use three bitfields, my pack function is no longer a no-op:

    union u {
        unsigned long ulong_val;
        struct {
            unsigned long a:4;
            unsigned long b:30;
            unsigned long c:30;
        };
    };

    union u pack(union u in)
    {
        union u ret;
        ret.ulong_val = in.c;
        ret.ulong_val <<= 30;
        ret.ulong_val |= in.b;
        ret.ulong_val <<= 4;
        ret.ulong_val |= in.a;
        return ret;
    }

And here's the output (with hex immediates for the ANDs):

    pack:
    .LFB11:
            .cfi_startproc
            movq    %rdi, %rax
            movq    %rdi, %rdx
            andl    $0xf, %edi
            shrq    $34, %rax
            shrq    $4, %rdx
            salq    $30, %rax
            andl    $0x3fff, %edx
            orq     %rdx, %rax
            salq    $4, %rax
            orq     %rdi, %rax
            ret
            .cfi_endproc

Possibly related to bug #15596 and maybe even a duplicate of bug #35363, but
I'm uncertain. So far I have only tested gcc 5.4.0 and 8 from git, and only on
x86, but I'm going to *guess* that this is a tree-optimization issue and not
an x86 backend issue.
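
For comparison, the manual shift-and-mask fallback mentioned above might look roughly like the sketch below. It assumes the layout gcc happens to use on x86-64 for the three-field union (a in bits 0-3, b in bits 4-33, c in bits 34-63), which the C standard does not guarantee. The names FIELD_MASK, get_a, get_b, get_c, pack_fields and roundtrip_ok are hypothetical and do not come from the attached test case.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field widths, mirroring the three-field union in this report.
 * Assumed bit positions (gcc on x86-64, not guaranteed by the C standard):
 * a = bits 0-3, b = bits 4-33, c = bits 34-63. */
enum { A_BITS = 4, B_BITS = 30, C_BITS = 30 };

/* Mask with the low `bits` bits set (valid for bits < 64). */
#define FIELD_MASK(bits) ((UINT64_C(1) << (bits)) - 1)

/* Extract each field from the packed 64-bit value. */
static inline uint64_t get_a(uint64_t v)
{
    return v & FIELD_MASK(A_BITS);
}

static inline uint64_t get_b(uint64_t v)
{
    return (v >> A_BITS) & FIELD_MASK(B_BITS);
}

static inline uint64_t get_c(uint64_t v)
{
    return (v >> (A_BITS + B_BITS)) & FIELD_MASK(C_BITS);
}

/* Rebuild the 64-bit value from its fields, in the same order as the
 * pack() function in this report: c first, then b, then a. */
static inline uint64_t pack_fields(uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t v = c & FIELD_MASK(C_BITS);
    v = (v << B_BITS) | (b & FIELD_MASK(B_BITS));
    v = (v << A_BITS) | (a & FIELD_MASK(A_BITS));
    return v;
}

/* Returns 1 if unpacking v and repacking the fields reproduces v. */
static inline int roundtrip_ok(uint64_t v)
{
    return pack_fields(get_a(v), get_b(v), get_c(v)) == v;
}
```

Because the three widths add up to 64 bits, unpacking and repacking should reproduce the input for every value, which is the same identity the bitfield version of pack is expected to reduce to.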