https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105661

            Bug ID: 105661
           Summary: Comparisons to atomic variables generates less
                    efficient code
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

With a plain variable, gcc generates a nice cmp/js pair when checking whether
the high bit is set. With an atomic, gcc generates a movzx/test/js triple,
even though it could use the same codegen as for the non-atomic case.

https://godbolt.org/z/GorvWfrsh

#include <atomic>
#include <cstdint>

[[gnu::noinline]] void f();
uint8_t plain;
std::atomic<uint8_t> atomic;

void plain_test() {
    if (plain & 0x80) f();
}

void atomic_test() {
    if (atomic.load(std::memory_order_relaxed) & 0x80) f();
}

On x86-64, both -O2 and -O3 generate:

plain_test():
        cmp     BYTE PTR plain[rip], 0
        js      .L4
        ret
.L4:
        jmp     f()
atomic_test():
        movzx   eax, BYTE PTR atomic[rip]
        test    al, al
        js      .L7
        ret
.L7:
        jmp     f()
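
For reference, the cmp/js form works because testing bit 7 of a byte is the
same as testing the sign of that byte when it is read as a signed 8-bit value.
A minimal sketch of that equivalence follows; the helper names are made up for
illustration and are not part of the testcase above, and the narrowing cast
assumes two's complement wrapping (which gcc provides, and C++20 guarantees).

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helpers, for illustration only.

// The check as written in the testcase: mask out bit 7.
inline bool high_bit_via_mask(uint8_t v) { return (v & 0x80) != 0; }

// The check the cmp/js pair effectively performs: is the byte negative
// when reinterpreted as signed? Assumes two's complement conversion.
inline bool high_bit_via_sign(uint8_t v) { return static_cast<int8_t>(v) < 0; }

// Returns true if both formulations agree for every possible byte value.
inline bool sign_test_matches_mask() {
    for (int i = 0; i < 256; ++i) {
        const uint8_t v = static_cast<uint8_t>(i);
        if (high_bit_via_mask(v) != high_bit_via_sign(v)) return false;
    }
    return true;
}

// The same sign check applied to the value of a relaxed atomic load.
inline bool atomic_high_bit(const std::atomic<uint8_t>& a) {
    return high_bit_via_sign(a.load(std::memory_order_relaxed));
}
```

This only illustrates that the two checks are equivalent on the loaded value;
it says nothing about which instructions gcc selects for either form.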

ARM64 seems to be hit even harder (https://godbolt.org/z/c3h8Y1dan), but I
don't know that platform well enough to know whether the non-atomic codegen is
valid there. It seems likely though, at least for a relaxed load.
