On Thu, 2022-11-17 at 10:55 +0800, Jinyang He wrote:
> On 2022/11/17 9:39 AM, Jinyang He wrote:
> 
> > On 2022/11/16 7:46 PM, Xi Ruoyao wrote:
> > 
> > > On Wed, 2022-11-16 at 10:11 +0800, Jinyang He wrote:
> > > 
> > > > > > +  return "%G6\\n\\t"
> > > > > > +        "1:\\n\\t"
> > > > > > +        "ll.<amo>\\t%0,%1\\n\\t"
> > > > > > +        "and\\t%7,%0,%z3\\n\\t"
> > > > > > +        "or%i5\\t%7,%7,%5\\n\\t"
> > > > > > +        "sc.<amo>\\t%7,%1\\n\\t"
> > > > > > +        "beqz\\t%7,1b\\n\\t";
> > > > > Do we need a "dbar 0x700" after beqz?
> > > > > 
> > > > > /* snip */
> > > > That's worth discussing. Actually I don't see any dbar hint definition
> > > > like 0x700 in the manual right now.
> > > > Besides, I think what should be provided here is a relaxed version. And
> > > > whether the barrier exists or not depends on the specific
> > > > memory_order.
> > > It's not related to memory order; it's a workaround for a hardware issue.
> > > Jiaxun told me (via LKML):
> > > 
> > >     I had checked with Loongson guys and they confirmed that the
> > >     workaround still needs to be applied to latest 3A4000 processors,
> > >     including 3A4000 for MIPS and 3A5000 for LoongArch.
> > >     Though, the reason behind the workaround varies with the evaluation
> > >     of their uArch, for GS464V based core, barrier is required as the
> > >     uArch design allows regular load to be reordered after an atomic
> > >     linked load, and that would break assumption of compiler atomic
> > >     constraints.
> > 
> > That certainly seems to be needed, but should it go before or after?
> > That's beyond my knowledge, so I've cc'ed huang...@loongson.cn for help.
> 
> 
> Pei told me that ll-sc currently works as follows:
> 
> uArch like:
>    ll -> (ll.dbar ll.ld_atomic)
>    sc -> (sc.dbar sc.st_atomic)
> 
> exchange:
> ll.dbar
> <---------------------------+
> ll.ld_atomic $rd            |
> ...(no jmp)                 |
> sc.dbar                     |
> sc.st_atomic $rd            |
> ld $rj -can-not-emit-at-----+
> 
> The load of $rj cannot be issued between ll.dbar and ll.ld_atomic
> because sc.dbar blocks it.
> 
> 
> compare and exchange:
> ll.dbar
> <-----------------------+
> ll.ld_atomic $rd        |
> ...(jmp) ---------------+------+
> sc.dbar                 |      |
> sc.st_atomic $rd        |      |
>                         |   <--+
> ld $rj -may-emit-at-----+
> 
> Jumping out of the ll-sc sequence may let the load of $rj be issued
> between ll.dbar and ll.ld_atomic.
> 
> 
> Thus, exchange does not need a dbar.
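
To check that I read the two diagrams correctly, here is a minimal C
sketch of the two operations using the GCC __atomic builtins.  The
function names are hypothetical, and the comments only restate the loop
shapes drawn above; they make no claim about what GCC emits today.

```c
#include <assert.h>
#include <stdbool.h>

/* Bare exchange: in the diagram the ll/sc loop has no early exit, so
   sc.dbar always executes before control leaves the loop.  */
static int exchange_relaxed(int *p, int v)
{
    return __atomic_exchange_n(p, v, __ATOMIC_RELAXED);
}

/* Compare-and-exchange: in the diagram the loop jumps out before sc
   when the loaded value differs from *expected, so sc.dbar is skipped
   on that path.  On failure *expected receives the observed value.  */
static bool compare_exchange_relaxed(int *p, int *expected, int desired)
{
    return __atomic_compare_exchange_n(p, expected, desired,
                                       false, /* strong variant */
                                       __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

/* Usage: the failing compare-and-exchange is the early-exit case.  */
static void demo(void)
{
    int x = 1, e = 1;
    assert(exchange_relaxed(&x, 2) == 1);          /* x: 1 -> 2 */
    assert(!compare_exchange_relaxed(&x, &e, 3));  /* 2 != 1, early exit */
    assert(e == 2);                                /* observed value */
}
```

Only the failing compare in demo() corresponds to the path where, per
the second diagram, a later load might be issued too early.
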
> 
> 
> > 
> > 
> > > 
> > > Without these dbar instructions I got random test failures in the
> > > GCC libgomp test suite.
> 
> Which test suite?

I mean when we didn't use dbar 0x700 for compare-and-exchange (during
the early development stage of GCC for LoongArch) I observed these
failures.

So we do need an additional dbar for compare-and-exchange, but do not
need it for a bare atomic exchange?
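
If so, the compare-and-exchange shape I have in mind looks roughly like
the sketch below.  It is an illustration only, not the actual GCC
expansion: cas_w_sketch is a hypothetical helper, the 0x700 hint is
simply the value discussed in this thread, and non-LoongArch targets
fall back to the generic builtin.  The dbar sits at the exit label so
the early-exit path executes it; the success path executes it too,
which by the diagram above is at worst redundant.

```c
#include <stdint.h>

/* Hypothetical helper: returns the value observed at *ptr.  The store
   happened if and only if the return value equals `expected`.  */
static int32_t cas_w_sketch(int32_t *ptr, int32_t expected, int32_t desired)
{
#if defined(__loongarch__)
    int32_t old, tmp;
    __asm__ __volatile__(
        "1:\n\t"
        "ll.w\t%0, %2, 0\n\t"     /* old = *ptr (linked load) */
        "bne\t%0, %3, 2f\n\t"     /* mismatch: jump out, skipping sc */
        "move\t%1, %4\n\t"
        "sc.w\t%1, %2, 0\n\t"     /* try to store desired */
        "beqz\t%1, 1b\n"          /* sc failed: retry */
        "2:\n\t"
        "dbar\t0x700\n"           /* barrier on the exit path */
        : "=&r"(old), "=&r"(tmp)
        : "r"(ptr), "r"(expected), "r"(desired)
        : "memory");
    return old;
#else
    /* Portable fallback: on failure the builtin writes the observed
       value into `expected`; on success it already equals it.  */
    __atomic_compare_exchange_n(ptr, &expected, desired, 0,
                                __ATOMIC_RELAXED, __ATOMIC_RELAXED);
    return expected;
#endif
}
```
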

-- 
Xi Ruoyao <xry...@xry111.site>
School of Aerospace Science and Technology, Xidian University
