On 27.09.22 11:23, Tobias Burnus wrote:

We do support
 #if __PTX_SM__ >= 600  (CUDA >= 8.0, ptx isa >= 5.0)
and we also can configure GCC with
 --with-arch=sm_70 (or sm_80 or ...)
Thus, adding atomics with .sys scope is possible.

See the attached patch. It seems to work fine, and I hope I got the
assembly right in terms of atomic use. (I also believe that
.release/.acquire do not need an additional __sync_synchronize()/"membar.sys".)

Regarding this:

While 'atom.op' (op = and/or/xor/cas/exch/add/inc/dec/min/max)
with scope is a sm_60 feature, the 'st'/'ld' used here, with scope qualifier
and .relaxed/.release (st) or .relaxed/.acquire (ld), require sm_70.

(This does not really matter, as only ..., sm_53 and sm_70, ... are currently
supported but not sm_60; still, the #if should obviously be fixed.)
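
For illustration, a minimal sketch of how such a guard could look with the
corrected sm_70 check. The helper names store_release_sys and
load_acquire_sys are hypothetical and not taken from the patch; the asm
operand constraints ("r") are an assumption as well. The PTX strings are
the two instructions quoted in this mail. Without __PTX_SM__ >= 700, the
sketch simply falls back to the generic GCC __atomic builtins.

```c
#include <stdint.h>

/* Hypothetical helpers; not the names used in the patch.  */

static inline void
store_release_sys (uint64_t *addr, uint64_t val)
{
#if defined (__PTX_SM__) && __PTX_SM__ >= 700
  /* 'st' with scope qualifier and .release requires sm_70.
     Operand constraints are an assumption.  */
  asm volatile ("st.global.release.sys.u64 [%0], %1;"
		: : "r" (addr), "r" (val) : "memory");
#else
  /* Older targets (and non-PTX builds): let the compiler choose.  */
  __atomic_store_n (addr, val, __ATOMIC_RELEASE);
#endif
}

static inline uint64_t
load_acquire_sys (uint64_t *addr)
{
#if defined (__PTX_SM__) && __PTX_SM__ >= 700
  uint64_t val;
  /* 'ld' with scope qualifier and .acquire requires sm_70.  */
  asm volatile ("ld.acquire.sys.global.u64 %0, [%1];"
		: "=r" (val) : "r" (addr) : "memory");
  return val;
#else
  return __atomic_load_n (addr, __ATOMIC_ACQUIRE);
#endif
}
```

The point of the guard is only that the scoped st/ld path is selected for
sm_70 and newer, and never for sm_60.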

* * *

Looking at the code generated without inline assembler, we have, instead of
 st.global.release.sys.u64 [%r27],%r39;
and
 ld.acquire.sys.global.u64 %r62,[%r27];
for older systems (__PTX_SM__ < 700), the code:
 @ %r69 membar.sys;
 @ %r69 atom.exch.b64 _,[%r27],%r41;
and
 ld.global.u64 %r64,[__gomp_rev_offload_var];
 ld.u64 %r36,[%r64];
 membar.sys;

In my understanding, the membar.sys ensures, similar to
 st.release / ld.acquire,
that memory operations are performed in the correct order in scope .sys.
The 'fn' variable is initially 0 and is then only set via the device,
i.e. there is eventually a DMA write device->host. That write is atomic,
as the whole int64_t is written at once (and not, e.g., first the lower
and then the upper half). Hence, the 'st'/'atom.exch' should work fine,
despite having no .sys scope.

Likewise, the membar.sys also applies in the other direction. Or did I
miss something? If so, would an explicit __sync_synchronize() (= membar.sys)
help between the 'st' and the 'ld'?
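
To make the question concrete, here is a rough single-threaded sketch of
where such an explicit barrier would sit. All names (fn_slot, device_post,
device_done, host_take) are hypothetical, and the host side clearing 'fn'
back to 0 is an assumption made for this illustration. The sketch only
shows the placement of the barriers; it does not settle whether the extra
one is actually required.

```c
#include <stdint.h>

/* Stand-in for the shared 'fn' field; initially 0.  */
static uint64_t fn_slot;

/* Device side: publish a request.  */
static void
device_post (uint64_t fn)
{
  __sync_synchronize ();	/* Full barrier before the store.  */
  __atomic_store_n (&fn_slot, fn, __ATOMIC_RELAXED);
  __sync_synchronize ();	/* The explicit barrier asked about,
				   between the 'st' and the 'ld'.  */
}

/* Device side: one polling step; nonzero once 'fn' is 0 again.  */
static int
device_done (void)
{
  uint64_t v = __atomic_load_n (&fn_slot, __ATOMIC_RELAXED);
  __sync_synchronize ();	/* Barrier after the load.  */
  return v == 0;
}

/* Host side (assumption): take the request and reset the slot.  */
static uint64_t
host_take (void)
{
  uint64_t fn = __atomic_load_n (&fn_slot, __ATOMIC_ACQUIRE);
  if (fn != 0)
    __atomic_store_n (&fn_slot, 0, __ATOMIC_RELEASE);
  return fn;
}
```

In real use the device would call device_done in a loop after device_post,
while the host polls with host_take from its own thread.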

Tobias


