https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968
--- Comment #3 from Sven <sven.koehler at gmail dot com> ---
I'm not familiar with GCC internals, so the following comments may be completely off.

This has been classified as a "missed optimization". I would not expect the optimizer to change 4 ldrb into a single ldr. This seems like a code generation issue to me, not an optimization issue. As can be seen with -O0, the aligned big endian access uses a single ldr but the unaligned big endian access uses the 4 ldrb.

It seems gcc behaves like this:

  if (big_endian_access) {
    if (aligned_access) {
      issue "ldr"
      issue "rev"
    } else {
      issue 4 "ldrb" in big-endian order
    }
  } else {
    if (aligned_access || arch_supports_unaligned) {
      issue "ldr"
    } else {
      issue 4 "ldrb" in little-endian order
    }
  }

Instead, gcc should behave like this:

  if (aligned_access || arch_supports_unaligned) {
    issue "ldr"
    if (big_endian_access) {
      issue "rev"
    }
  } else {
    if (big_endian_access) {
      issue 4 "ldrb" in big-endian order
    } else {
      issue 4 "ldrb" in little-endian order
    }
  }
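To make the difference between the two decision procedures concrete, here is a minimal C sketch that models them as plain functions and walks all eight input combinations. It is only a model of the pseudocode in this comment, not GCC code: the enum labels and the function names (observed_seq, proposed_seq, count_differences) are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the instruction sequences named in the comment. */
enum load_seq {
    SEQ_LDR,         /* single ldr */
    SEQ_LDR_REV,     /* ldr followed by rev */
    SEQ_LDRB_X4_BE,  /* four ldrb, combined in big-endian order */
    SEQ_LDRB_X4_LE   /* four ldrb, combined in little-endian order */
};

/* Model of the behaviour gcc seems to have, per the first pseudocode block. */
enum load_seq observed_seq(bool big_endian, bool aligned, bool unaligned_ok)
{
    if (big_endian)
        return aligned ? SEQ_LDR_REV : SEQ_LDRB_X4_BE;
    return (aligned || unaligned_ok) ? SEQ_LDR : SEQ_LDRB_X4_LE;
}

/* Model of the suggested behaviour, per the second pseudocode block. */
enum load_seq proposed_seq(bool big_endian, bool aligned, bool unaligned_ok)
{
    if (aligned || unaligned_ok)
        return big_endian ? SEQ_LDR_REV : SEQ_LDR;
    return big_endian ? SEQ_LDRB_X4_BE : SEQ_LDRB_X4_LE;
}

/* Count the input combinations on which the two models disagree.
   The assert documents the only case where they should differ:
   big-endian, unaligned access on a target that supports unaligned ldr. */
int count_differences(void)
{
    int n = 0;
    for (int be = 0; be <= 1; be++)
        for (int al = 0; al <= 1; al++)
            for (int ok = 0; ok <= 1; ok++)
                if (observed_seq(be, al, ok) != proposed_seq(be, al, ok)) {
                    assert(be && !al && ok);
                    n++;
                }
    return n;
}
```

Under this model the two procedures disagree on exactly one combination: observed_seq(true, false, true) gives the four-ldrb sequence, while proposed_seq(true, false, true) gives ldr followed by rev. Every other combination is unchanged, which is why the suggested restructuring should not affect the little-endian or aligned cases.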