https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87206

            Bug ID: 87206
           Summary: Suboptimal code generation for
                    __atomic_compare_exchange_n followed by a comparison
           Product: gcc
           Version: 8.2.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iii at linux dot ibm.com
                CC: krebbel at gcc dot gnu.org
  Target Milestone: ---

I built example #5 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080 on x86_64 and observed a
similar issue:

$ cat 1.c
extern void bar (int *);

void foo5(int *mem)
{
  int oldval = 0;
  __atomic_compare_exchange_n (mem, (void *) &oldval, 1,
                               1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
  if (oldval != 0)
    bar (mem);
}

$ gcc-8 -c 1.c -O3 -g

$ objdump -d 1.o
# skip
0000000000000000 <_foo5>:
   0:   31 c0                   xor    %eax,%eax
   2:   ba 01 00 00 00          mov    $0x1,%edx
   7:   f0 0f b1 17             lock cmpxchg %edx,(%rdi)
   b:   85 c0                   test   %eax,%eax
   d:   75 01                   jne    10 <_foo5+0x10>
   f:   c3                      retq
  10:   e9 00 00 00 00          jmpq   15 <_foo5+0x15>

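The "test %eax,%eax" is redundant: CMPXCHG already sets ZF when the
comparison succeeds and clears it otherwise. Since oldval starts at 0, it is
non-zero afterwards exactly when the exchange failed, so the jne could
directly follow the lock cmpxchg.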
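
For comparison, here is a minimal sketch of the same logic written against
the builtin's boolean return value instead of re-testing oldval. It may give
the compiler an easier path to branching on the flags, but I have not
verified the generated code for it here. The name foo5_result and the
counting bar() stub are assumptions made only so the snippet runs on its own;
the report's bar() is external. Note that it uses a strong CAS (weak = 0),
unlike the original: with a weak CAS, a spurious failure would return false
while oldval is still 0, so the two forms would not be equivalent on targets
where weak CAS can fail spuriously.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the report's external bar(); it only counts
   calls so that the sketch is self-contained. */
int bar_calls;
void bar (int *mem) { (void) mem; bar_calls++; }

/* Same intent as foo5, but the branch uses the builtin's result.
   Strong CAS (4th argument 0): "returned false" implies *mem was not 0. */
void foo5_result (int *mem)
{
  int oldval = 0;
  bool ok = __atomic_compare_exchange_n (mem, &oldval, 1,
                                         0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
  if (!ok)
    bar (mem);
}
```

Building this with an external bar() and the same "-O3" options as above
would be the way to compare the resulting code against the listing for foo5.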

I wonder if it would be possible to come up with a common solution for all
architectures, including x86_64 and s390?
