https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110724

            Bug ID: 110724
           Summary: Unnecessary alignment on branch to unconditional branch
                    targets
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: javier.martinez.bugzilla at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/f7qMxxfMj

void duff(int * __restrict to, const int * __restrict from, const int count)
{
    int n = (count+7) / 8;
    switch(count%8)
    {
       case 0: do { *to++ = *from++;
       case 7:      *to++ = *from++;
       case 6:      *to++ = *from++;
       case 5:      *to++ = *from++;
       case 4:      *to++ = *from++;
       case 3:      *to++ = *from++;
       case 2:      *to++ = *from++;
       [[likely]] case 1: *to++ = *from++;
                  } while (--n>0);
    }
}

Trunk with -O3:

        jle     .L1
        [...]
        lea     rax, [rax+4]
        jmp     .L5             # <-- no fall-through to ret
        .p2align 4,,7           # <-- unnecessary alignment
        .p2align 3
.L1:
        ret

I believe this 16-byte alignment is done to put the branch target at the
beginning of a front-end instruction fetch block. That, however, seems
unnecessary when the branch target is itself an unconditional branch, as the
instructions that follow it will not retire.

In this example the degradation is limited to code size / instruction caching,
as there is no possible fall-through to .L1 that would cause the nops to be
consumed.

Changing the C++ attribute to [[unlikely]] introduces fall-through, and GCC
seems to remove the padding, which is great.
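
To put a number on the code-size cost, here is a minimal sketch of how many padding bytes the `.p2align 4,,7` / `.p2align 3` pair in front of .L1 can emit. It assumes the documented GNU as semantics: the third operand is a maximum skip, and the alignment is not done at all if it would need more bytes than that. The helper names `pad_to` and `padding_for_pair` are hypothetical and exist only for this illustration; they are not part of GCC or binutils.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed to advance `offset` to the next multiple of 2^log2_align.
inline std::uint64_t pad_to(std::uint64_t offset, unsigned log2_align) {
    assert(log2_align < 64);
    const std::uint64_t align = std::uint64_t{1} << log2_align;
    return (align - offset % align) % align;
}

// Padding emitted by the directive pair
//     .p2align 4,,7
//     .p2align 3
// when the location counter is at `offset` (hypothetical helper, modelling
// the assumed GNU as max-skip behaviour described in the lead-in).
inline std::uint64_t padding_for_pair(std::uint64_t offset) {
    const std::uint64_t to16 = pad_to(offset, 4);
    if (to16 <= 7) {
        // First directive applies; the result is 16-byte aligned, so the
        // second directive has nothing left to do.
        return to16;
    }
    // First directive is skipped (needs more than 7 bytes); the second one
    // aligns to 8 bytes unconditionally.
    return pad_to(offset, 3);
}
```

Under these assumptions the pair never emits more than 7 bytes, so the overhead for each such label is between 0 and 7 bytes of nops, depending on where the preceding jmp happens to end.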