https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96932

            Bug ID: 96932
           Summary: [nvptx] atomic_exchange missing barrier
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

After digging into GOMP_atomic_start/end I realized these also imply barrier
semantics.

Looking at the source code used for nvptx in libgomp/config/accel/mutex.h,
that should be fine, since both builtins used there carry barrier semantics:
...
static inline void
gomp_mutex_lock (gomp_mutex_t *mutex)
{
  while (__sync_lock_test_and_set (mutex, 1))
    /* spin */ ;
}

static inline void
gomp_mutex_unlock (gomp_mutex_t *mutex)
{
  __sync_lock_release (mutex);
}
...
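
For reference, the GCC manual documents __sync_lock_test_and_set as an
acquire barrier and __sync_lock_release as a release barrier (neither is a
full barrier). A minimal sketch of the same lock written with the __atomic
builtins makes that expectation explicit. The names sketch_mutex_t,
sketch_mutex_lock and sketch_mutex_unlock are hypothetical and not part of
libgomp, and this is only an approximate restatement, not the libgomp
implementation:

```c
/* Hypothetical stand-in for gomp_mutex_t; not the libgomp type.  */
typedef int sketch_mutex_t;

static inline void
sketch_mutex_lock (sketch_mutex_t *mutex)
{
  /* Acquire: accesses after the lock must not be moved above the
     exchange.  This is the ordering the lock side is expected to have.  */
  while (__atomic_exchange_n (mutex, 1, __ATOMIC_ACQUIRE))
    /* spin */ ;
}

static inline void
sketch_mutex_unlock (sketch_mutex_t *mutex)
{
  /* Release: accesses before the unlock must not be moved below the
     store.  */
  __atomic_store_n (mutex, 0, __ATOMIC_RELEASE);
}
```

On the lock side, the ordering constraint applies after the exchange, so
that is where a barrier would be expected in the generated code.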

However, when looking at the resulting code in libgomp.a, we see that there's
no barrier after the atomic exchange in GOMP_atomic_start:
...
.visible .func GOMP_atomic_start
{
.reg .u32 %r22;
.reg .pred %r23;
$L2:
.loc 1 51 10
atom.global.exch.b32 %r22,[atomic_lock],1;
.loc 1 51 9
setp.ne.u32 %r23,%r22,0;
@ %r23 bra $L2;
.loc 2 43 1
ret;
}
...

There is one (membar.sys), however, in GOMP_atomic_end:
...
.visible .func GOMP_atomic_end
{
.reg .u32 %r22;
.loc 1 58 3
membar.sys;
mov.u32 %r22,0;
st.global.u32 [atomic_lock],%r22;
.loc 2 49 1
ret;
}
...
