https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91131

--- Comment #9 from Per Dalgas Jakobsen <pdj at knaldgas dot dk> ---
(In reply to Richard Biener from comment #8)
> Fixed on trunk sofar.
> 
> Note the non-optimal code-gen probably was a side-effect of us making
> three volatile accesses out of one.  On x86 I now see
> 
> main:
> .LFB0:
>         .cfi_startproc
>         movl    $0, Reg_A(%rip)
>         xorl    %eax, %eax
>         movl    $8, Reg_B(%rip)
>         movl    $255, Reg_C(%rip)
>         movb    $0, Reg_D(%rip)
>         movb    $-1, Reg_E(%rip)
>         ret

That's as efficient as it gets :)

On the AVR architecture I now get this:
avr-gcc -O0:
        andi    r25, 0xF8       ; 248
        andi    r25, 0xF7       ; 247
        andi    r25, 0x0F       ; 15
        sts     0x0064, r25     ; 0x800064 <Reg_A>
        andi    r24, 0xF8       ; 248
        ori     r24, 0x08       ; 8
        andi    r24, 0x0F       ; 15
        sts     0x0065, r24     ; 0x800065 <Reg_B>
        lds     r24, 0x0060     ; 0x800060 <__data_start>
        sts     0x0063, r24     ; 0x800063 <Reg_C>
        sts     0x0066, r1      ; 0x800066 <Reg_D>
        ldi     r24, 0xFF       ; 255
        sts     0x0062, r24     ; 0x800062 <__data_end>

avr-gcc -O3 (and -Os, -O1, -O2):
        sts     0x0064, r1      ; 0x800064 <Reg_A>
        ldi     r24, 0x08       ; 8
        sts     0x0065, r24     ; 0x800065 <Reg_B>
        lds     r24, 0x0060     ; 0x800060 <__data_start>
        sts     0x0063, r24     ; 0x800063 <Reg_C>
        sts     0x0066, r1      ; 0x800066 <Reg_D>
        ldi     r24, 0xFF       ; 255
        sts     0x0062, r24     ; 0x800062 <__data_end>

Nice improvement, but the value for Reg_C is still loaded from memory (the lds
from 0x0060) instead of being an immediate. Is it possible to get that into an
immediate as well?

I'm slightly surprised that -O0 shows the setting of individual fields. It's
certainly not a bug, perhaps not even an issue, and absolutely something I can
live with :)


> as of using a packed structure the reason it might be problematic is that
> this lowers its alignment to 1 byte.  There are architectures that cannot
> do unaligned accesses so when the bitfield spans more than one byte the
> access might need to be decomposed.  Using an aligned attribute in addition
> to the packed attribute and aligning the structure appropriately would be
> a solution to this issue.

Ah, got it, thanks!
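
For the record, a minimal sketch of how I read that suggestion, assuming GCC's
attribute syntax. The struct names, field names, and field widths below are
made up for illustration and are not the ones from my test case. It only shows
that packed alone drops the alignment to 1 byte, while packed plus aligned(2)
restores it:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-bit register layout; names and widths are assumptions. */
struct reg_packed {
    uint16_t mode  : 3;
    uint16_t flag  : 1;
    uint16_t count : 12;   /* spans more than one byte */
} __attribute__((packed));

/* Same layout, but with the alignment raised back to 2 bytes. */
struct reg_packed_aligned {
    uint16_t mode  : 3;
    uint16_t flag  : 1;
    uint16_t count : 12;
} __attribute__((packed, aligned(2)));

/* packed alone lowers the alignment of the whole structure to 1 byte. */
_Static_assert(alignof(struct reg_packed) == 1,
               "packed struct is expected to be byte-aligned");

/* Adding aligned(2) lets the compiler assume a 2-byte aligned object. */
_Static_assert(alignof(struct reg_packed_aligned) == 2,
               "packed+aligned struct is expected to be 2-byte aligned");

/* Small helpers so the alignments can also be checked at run time. */
size_t packed_alignment(void)
{
    return alignof(struct reg_packed);
}

size_t packed_aligned_alignment(void)
{
    return alignof(struct reg_packed_aligned);
}
```

Whether the aligned variant actually avoids decomposed accesses on a given
target is something I have not verified; the sketch only covers the alignment
itself.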
