https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103069

--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Hongyu Wang <hong...@gcc.gnu.org>:

https://gcc.gnu.org/g:0435b978f95971e139882549f5a1765c50682216

commit r12-7316-g0435b978f95971e139882549f5a1765c50682216
Author: Hongyu Wang <hongyu.w...@intel.com>
Date:   Fri Feb 11 14:44:15 2022 +0800

    i386: Relax cmpxchg instruction under -mrelax-cmpxchg-loop [PR103069]

    cmpxchg is commonly used in spin loops, and some user code, such as
    pthread, uses cmpxchg directly as the loop condition, which causes
    heavy cache-line bouncing.

    This patch extends the previous implementation to relax all cmpxchg
    instructions under -mrelax-cmpxchg-loop: it adds an extra atomic load
    and compare, and emulates the behavior of a failed cmpxchg.

    An original spin loop such as

    loop: mov    %eax,%r8d
          or     $1,%r8d
          lock cmpxchg %r8d,(%rdi)
          jne    loop

    will now turn into

    loop: mov    %eax,%r8d
          or     $1,%r8d
          mov    (%rdi),%rsi <--- load lock first
          cmp    %rsi,%rax <--- compare with expected input
          jne    .L2 <--- lock ne expected
          lock cmpxchg %r8d,(%rdi)
          jne    loop
     .L2: mov    %rsi,%rax <--- perform the behavior of failed cmpxchg
          jne    loop

    under -mrelax-cmpxchg-loop.
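
The relaxed sequence above can be sketched at the source level with C11 atomics. This is only an illustration of the load-compare-then-cmpxchg idea, not the code GCC emits: the helper names `relaxed_cas` and `set_low_bit` are hypothetical, and the default sequentially consistent ordering is an assumption made for simplicity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: mimics the relaxed expansion by hand.
   It loads the lock word first and only issues the locked
   compare-exchange when the loaded value matches *expected. */
static bool relaxed_cas(_Atomic unsigned *lock, unsigned *expected,
                        unsigned desired)
{
    assert(lock != NULL && expected != NULL);

    /* Plain atomic load: does not take the cache line exclusively. */
    unsigned seen = atomic_load(lock);
    if (seen != *expected) {
        /* Emulate a failed cmpxchg: report the observed value
           through *expected and signal failure. */
        *expected = seen;
        return false;
    }
    /* Values matched, so attempt the real lock cmpxchg. */
    return atomic_compare_exchange_strong(lock, expected, desired);
}

/* Hypothetical spin loop corresponding to the "or $1" example:
   atomically set bit 0 and return the previous value. */
static unsigned set_low_bit(_Atomic unsigned *lock)
{
    unsigned old = atomic_load(lock);
    while (!relaxed_cas(lock, &old, old | 1u))
        ;  /* old was refreshed by the failed attempt; retry */
    return old;
}
```

Calling `set_low_bit` on a lock word holding 4 returns 4 and leaves 5 in the lock. When the early compare fails, no locked instruction is issued, which is where the reduction in cache-line bouncing is expected to come from.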

    gcc/ChangeLog:

            PR target/103069
            * config/i386/i386-expand.cc (ix86_expand_atomic_fetch_op_loop):
            Split atomic fetch and loop part.
            (ix86_expand_cmpxchg_loop): New expander for cmpxchg loop.
            * config/i386/i386-protos.h (ix86_expand_cmpxchg_loop): New
            prototype.
            * config/i386/sync.md (atomic_compare_and_swap<mode>): Call new
            expander under TARGET_RELAX_CMPXCHG_LOOP.
            (atomic_compare_and_swap<mode>): Likewise for doubleword modes.

    gcc/testsuite/ChangeLog:

            PR target/103069
            * gcc.target/i386/pr103069-2.c: Adjust result check.
            * gcc.target/i386/pr103069-3.c: New test.
            * gcc.target/i386/pr103069-4.c: Likewise.
