https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105661
Bug ID: 105661
Summary: Comparisons to atomic variables generates less efficient code
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: redbeard0531 at gmail dot com
Target Milestone: ---

With normal variables, gcc will generate a nice cmp/js pair when checking
whether the high bit is set. When using an atomic, gcc generates a
movzx/test/js triple, even though it could use the same codegen as for a
non-atomic.

https://godbolt.org/z/GorvWfrsh

#include <atomic>
#include <cstdint>

[[gnu::noinline]] void f();
uint8_t plain;
std::atomic<uint8_t> atomic;

void plain_test() {
    if (plain & 0x80) f();
}

void atomic_test() {
    if (atomic.load(std::memory_order_relaxed) & 0x80) f();
}

With both -O2 and -O3 this generates:

plain_test():
        cmp     BYTE PTR plain[rip], 0
        js      .L4
        ret
.L4:
        jmp     f()
atomic_test():
        movzx   eax, BYTE PTR atomic[rip]
        test    al, al
        js      .L7
        ret
.L7:
        jmp     f()

ARM64 seems to be hit even harder (https://godbolt.org/z/c3h8Y1dan), but I
don't know that platform well enough to know whether the non-atomic codegen
is valid there. It seems likely though, at least for a relaxed load.
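
For readers wondering why a cmp/js pair is enough for the test in the report: checking bit 7 of a byte is the same as asking whether that byte is negative when read as a signed 8-bit value, which is what the sign flag reports after comparing against 0. The sketch below is not part of the original report. The helper names (high_bit_set_mask, high_bit_set_sign, atomic_high_bit_set, verify_equivalence) are made up for illustration, and it only checks the source-level equivalence, not what any compiler emits.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: the mask form written in the report (v & 0x80).
constexpr bool high_bit_set_mask(std::uint8_t v) {
    return (v & 0x80) != 0;
}

// Hypothetical helper: the sign form that "cmp byte, 0; js" corresponds to.
// The narrowing conversion is modular in C++20; before C++20 it is
// implementation-defined, and GCC documents it as modular.
constexpr bool high_bit_set_sign(std::uint8_t v) {
    return static_cast<std::int8_t>(v) < 0;
}

// Hypothetical helper: the same test applied to a relaxed atomic load,
// mirroring atomic_test() from the report.
inline bool atomic_high_bit_set(const std::atomic<std::uint8_t>& a) {
    return high_bit_set_mask(a.load(std::memory_order_relaxed));
}

// Exhaustively check that all three forms agree for every byte value.
// Note: the asserts compile away if NDEBUG is defined.
inline void verify_equivalence() {
    for (int i = 0; i < 256; ++i) {
        const auto v = static_cast<std::uint8_t>(i);
        assert(high_bit_set_mask(v) == high_bit_set_sign(v));
        std::atomic<std::uint8_t> a{v};
        assert(atomic_high_bit_set(a) == high_bit_set_mask(v));
    }
}
```

Calling verify_equivalence() from a test harness exercises all 256 byte values. Since the predicate is identical for the plain and the atomic variable, the only difference between the two functions in the report is how the byte is loaded.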