https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98982
Bug ID: 98982
Summary: Optimizing loop variants of fixed-byte-order functions
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: j...@jak-linux.org
Target Milestone: ---

These functions actually generate loops, whereas they should just be
optimized to no-ops. It's the classic byte-read/write idiom for
little-endian encoded integers, just written as a loop to accommodate
varying sizeof(T) in the template.

    template <typename T>
    struct little_endian
    {
        uint8_t data[sizeof(T)];

        constexpr little_endian() : data{} { }

        constexpr little_endian(T in)
        {
            for (size_t i = 0; i < sizeof(T); i++)
                data[i] = (in >> (8 * i)) & 0xFF;
        }

        constexpr operator T() const
        {
            T res = 0;
            for (size_t i = 0; i < sizeof(T); i++)
                res |= static_cast<T>(data[i]) << (8u * i);
            return res;
        }
    };

-O3 or -funroll-loops correctly unrolls the encoding (the constructor),
but the operator T() still ends up being

    endbr64
    movabsq $72057594037927935, %rdx
    movq    %rdi, %rax
    shrq    $56, %rax
    andq    %rdi, %rdx
    salq    $56, %rax
    orq     %rdx, %rax
    ret

Seems weird to me.
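
For comparison, here is a minimal sketch that puts the loop form of the decoder next to a hand-unrolled 32-bit version of the same idiom. It is not part of the original report: the names load_le_loop, load_le32_unrolled, sample and check_decoders are hypothetical and exist only for illustration. The sketch shows that both forms compute the same value, so any difference in generated code is purely an optimization matter. It makes no claim about what a particular GCC version emits for either function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Loop form of the decoder, mirroring operator T() from the report.
// (Hypothetical free-function name, used here for illustration only.)
template <typename T>
constexpr T load_le_loop(const std::uint8_t (&data)[sizeof(T)])
{
    T res = 0;
    for (std::size_t i = 0; i < sizeof(T); i++)
        res |= static_cast<T>(data[i]) << (8u * i);
    return res;
}

// Hand-unrolled form of the same idiom for a 32-bit value.
constexpr std::uint32_t load_le32_unrolled(const std::uint8_t (&data)[4])
{
    return static_cast<std::uint32_t>(data[0])
         | static_cast<std::uint32_t>(data[1]) << 8
         | static_cast<std::uint32_t>(data[2]) << 16
         | static_cast<std::uint32_t>(data[3]) << 24;
}

// Little-endian encoding of 0x12345678: least significant byte first.
constexpr std::uint8_t sample[4] = {0x78, 0x56, 0x34, 0x12};

// Both forms agree at compile time.
static_assert(load_le_loop<std::uint32_t>(sample) == 0x12345678u,
              "loop decoder mismatch");
static_assert(load_le32_unrolled(sample) == 0x12345678u,
              "unrolled decoder mismatch");

// Runtime version of the same check, for use from a test driver.
inline void check_decoders()
{
    std::uint8_t bytes[4] = {0x78, 0x56, 0x34, 0x12};
    assert(load_le_loop<std::uint32_t>(bytes) == load_le32_unrolled(bytes));
}
```

Compiling both functions with the same flags (for example -O2 and -O3) and diffing the assembly is one way to see whether the loop variant is recognized like the unrolled one on a given compiler version.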