On 2022/11/17 11:38 AM, Xi Ruoyao wrote:

On Thu, 2022-11-17 at 10:55 +0800, Jinyang He wrote:
On 2022/11/17 9:39 AM, Jinyang He wrote:

On 2022/11/16 7:46 PM, Xi Ruoyao wrote:

On Wed, 2022-11-16 at 10:11 +0800, Jinyang He wrote:

+  return "%G6\\n\\t"
+        "1:\\n\\t"
+        "ll.<amo>\\t%0,%1\\n\\t"
+        "and\\t%7,%0,%z3\\n\\t"
+        "or%i5\\t%7,%7,%5\\n\\t"
+        "sc.<amo>\\t%7,%1\\n\\t"
+        "beqz\\t%7,1b\\n\\t";
Do we need a "dbar 0x700" after beqz?

/* snip */
That's worth discussing. Actually I don't see any dbar hint definition
like 0x700 in the manual right now.
Besides, I think what should be provided here is a relaxed version, and
whether the barrier exists or not depends on the specific
memory_order.
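
To illustrate what "depends on the specific memory_order" means at the source level, here is a minimal C11 sketch. The wrapper names swap_relaxed and swap_seq_cst are hypothetical and exist only for this example; which instructions (if any barrier) a compiler emits for each is target-specific and is not claimed here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical wrapper: atomic exchange with no ordering requirement.
   A compiler is free to emit no ordering barrier for this variant. */
static int swap_relaxed(atomic_int *p, int v)
{
    assert(p != NULL);
    return atomic_exchange_explicit(p, v, memory_order_relaxed);
}

/* Hypothetical wrapper: the same exchange, but sequentially consistent.
   Any ordering barrier is implied by the memory order argument. */
static int swap_seq_cst(atomic_int *p, int v)
{
    assert(p != NULL);
    return atomic_exchange_explicit(p, v, memory_order_seq_cst);
}
```

Both wrappers return the previous value; they differ only in the ordering they request from the compiler and hardware.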
It's not related to memory order; it's a workaround for a hardware issue.
Jiaxun told me (via LKML):

     I had checked with Loongson guys and they confirmed that the
     workaround still needs to be applied to latest 3A4000 processors,
     including 3A4000 for MIPS and 3A5000 for LoongArch.

     Though, the reason behind the workaround varies with the
     evaluation of their uArch, for GS464V based core, barrier is
     required as the uArch design allows regular load to be reordered
     after an atomic linked load, and that would break assumption of
     compiler atomic constraints.
That certainly seems to be needed, but whether it belongs before or
after is beyond my knowledge, so I've cc'd huang...@loongson.cn for help.

Pei told me that ll-sc currently works as follows:

uArch like:
    ll -> (ll.dbar ll.ld_atomic)
    sc -> (sc.dbar sc.st_atomic)

exchange:
ll.dbar
<---------------------------+
ll.ld_atomic $rd            |
...(no jmp)                 |
sc.dbar                     |
sc.st_atomic $rd            |
ld $rj -can-not-emit-at-----+

The load of $rj cannot be issued between ll.dbar and ll.ld_atomic because
sc.dbar acts as a barrier in front of it.


compare and exchange:
ll.dbar
<-----------------------+
ll.ld_atomic $rd        |
...(jmp) ---------------+------+
sc.dbar                 |      |
sc.st_atomic $rd        |      |
                        |   <--+
ld $rj -may-emit-at-----+

Jumping out of the ll-sc sequence skips sc.dbar, so the load of $rj may be
issued between ll.dbar and ll.ld_atomic.


Thus, exchange does not need a dbar.
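
As a rough C-level analogy of the two shapes above (a sketch only, not the code GCC emits for LoongArch): the helper names xchg_relaxed and cas_relaxed are hypothetical, and atomic_compare_exchange_weak_explicit stands in for the sc step only because it may fail and force a retry. The exchange loop has no early exit, while the compare-and-exchange loop branches out on a mismatch, which corresponds to the (jmp) path in the second diagram.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: unconditional exchange.
   The retry loop can only be left through a successful store. */
static int xchg_relaxed(atomic_int *p, int desired)
{
    assert(p != NULL);
    int old = atomic_load_explicit(p, memory_order_relaxed);
    /* On failure, 'old' is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak_explicit(
               p, &old, desired,
               memory_order_relaxed, memory_order_relaxed))
        ;
    return old;
}

/* Hypothetical helper: compare-and-exchange.
   A mismatch leaves the loop early, without performing the store. */
static bool cas_relaxed(atomic_int *p, int expected, int desired, int *seen)
{
    assert(p != NULL && seen != NULL);
    int old = atomic_load_explicit(p, memory_order_relaxed);
    for (;;) {
        if (old != expected) {      /* early exit: the "(jmp)" path */
            *seen = old;
            return false;
        }
        if (atomic_compare_exchange_weak_explicit(
                p, &old, desired,
                memory_order_relaxed, memory_order_relaxed)) {
            *seen = old;            /* equals 'expected' on success */
            return true;
        }
        /* Failed store: 'old' now holds the current value; re-check it. */
    }
}
```

The early return in cas_relaxed is the case the thread is concerned with: the store step, and whatever barrier the hardware attaches to it, is never reached on that path.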



Without these dbar instructions I got random test failures in the GCC
libgomp test suite.
Which test suite?
I mean when we didn't use dbar 0x700 for compare-and-exchange (during
the early development stage of GCC for LoongArch) I observed these
failures.

So we do need an additional dbar for compare-and-exchange, but do not
need it for a bare atomic exchange?
Yes.
