On Thu, 2022-11-17 at 10:55 +0800, Jinyang He wrote:
> On 2022/11/17 9:39 AM, Jinyang He wrote:
>
> > On 2022/11/16 7:46 PM, Xi Ruoyao wrote:
> >
> > > On Wed, 2022-11-16 at 10:11 +0800, Jinyang He wrote:
> > >
> > > > > > +  return "%G6\\n\\t"
> > > > > > +         "1:\\n\\t"
> > > > > > +         "ll.<amo>\\t%0,%1\\n\\t"
> > > > > > +         "and\\t%7,%0,%z3\\n\\t"
> > > > > > +         "or%i5\\t%7,%7,%5\\n\\t"
> > > > > > +         "sc.<amo>\\t%7,%1\\n\\t"
> > > > > > +         "beqz\\t%7,1b\\n\\t";
> > > > > Do we need a "dbar 0x700" after beqz?
> > > > >
> > > > > /* snip */
> > > > That's worth discussing. Actually I don't see any dbar hint definition
> > > > like 0x700 in the manual right now.
> > > > Besides, I think what should be provided here is a relaxed version. And
> > > > whether the barrier exists or not depends on the specific
> > > > memory_order.
> > > It's not related to memory order, but for a hardware issue workaround.
> > > Jiaxun told me (via LKML):
> > >
> > >   I had checked with Loongson guys and they confirmed that the
> > >   workaround still needs to be applied to latest 3A4000 processors,
> > >   including 3A4000 for MIPS and 3A5000 for LoongArch.
> > >   Though, the reason behind the workaround varies with the evaluation
> > >   of their uArch, for GS464V based core, barrier is required as the
> > >   uArch design allows regular load to be reordered after an atomic
> > >   linked load, and that would break assumption of compiler atomic
> > >   constraints.
> >
> > That certainly seems to be needed, but before or after? It's beyond my
> > recognition and cc huang...@loongson.cn for help.
>
> Pei told me the ll-sc works at present as follows,
>
> uArch like:
>    ll -> (ll.dbar ll.ld_atomic)
>    sc -> (sc.dbar sc.st_atomic)
>
> exchange:
>    ll.dbar
>       <---------------------------+
>    ll.ld_atomic $rd               |
>    ...(no jmp)                    |
>    sc.dbar                        |
>    sc.st_atomic $rd               |
>    ld $rj   -can-not-emit-at------+
>
> The load of $rj cannot be emitted between ll.dbar and ll.ld_atomic because
> the sc.dbar acts as a barrier against it.
>
> compare and exchange:
>    ll.dbar
>       <-----------------------+
>    ll.ld_atomic $rd           |
>    ...(jmp) -----------------+------+
>    sc.dbar                    |      |
>    sc.st_atomic $rd           |      |
>                               |   <--+
>    ld $rj   -may-emit-at-----+
>
> Jumping out of the ll-sc sequence may lead to the load of $rj being emitted
> between ll.dbar and ll.ld_atomic.
>
> Thus, exchange does not need dbar.
>
> > >
> > > Without these dbar instructions I'd got random test failures in GCC
> > > libgomp test suite.
> > Which test suite?

I mean when we didn't use dbar 0x700 for compare-and-exchange (during
the early development stage of GCC for LoongArch) I observed these
failures.

So we do need an additional dbar for compare-and-exchange, but do not
need it for a bare atomic exchange?

-- 
Xi Ruoyao <xry...@xry111.site>
School of Aerospace Science and Technology, Xidian University
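
For reference, the two operations being compared can be written at the C level as below. This is only a minimal sketch using the GCC `__atomic` builtins; the wrapper names `exchange_relaxed`, `compare_exchange_seq_cst` and `demo` are made up for illustration and are not part of the patch. The comments about the loop shape restate what is described in this thread; the exact instruction sequence (and whether a trailing dbar is emitted) depends on the GCC version and target.

```c
#include <assert.h>
#include <stdbool.h>

/* Bare atomic exchange.  As described in the thread, the ll/sc loop
   for this case has no branch that leaves the loop before the sc.  */
static int exchange_relaxed(int *ptr, int newval)
{
    return __atomic_exchange_n(ptr, newval, __ATOMIC_RELAXED);
}

/* Compare-and-exchange.  As described in the thread, the ll/sc loop
   can jump out before the sc when the comparison fails, which is the
   path for which the extra dbar is being discussed.  */
static bool compare_exchange_seq_cst(int *ptr, int *expected, int desired)
{
    return __atomic_compare_exchange_n(ptr, expected, desired,
                                       false /* strong variant */,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Hypothetical single-threaded self-check of the two wrappers.  */
static int demo(void)
{
    int v = 1;
    int old = exchange_relaxed(&v, 2);          /* v: 1 -> 2 */
    assert(old == 1 && v == 2);

    int expected = 5;                           /* wrong guess, CAS fails */
    bool ok = compare_exchange_seq_cst(&v, &expected, 9);
    assert(!ok && expected == 2 && v == 2);     /* expected is updated */

    ok = compare_exchange_seq_cst(&v, &expected, 9);  /* now succeeds */
    assert(ok && v == 9);
    return v;
}
```

Compiling this for LoongArch with `-O2 -S` and comparing the two loops should show whether a barrier follows the failure branch of the compare-and-exchange; a single-threaded check like `demo` cannot expose the reordering itself.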