Honnappa?

07/10/2020 11:55, Diogo Behrens:
> Hi Thomas,
> 
> we are still waiting for the comments from Honnappa. In our understanding, 
> the missing barrier is a bug according to the model. We reproduced the 
> scenario in herd7, which represents the authoritative memory model: 
> https://developer.arm.com/architectures/cpu-architecture/a-profile/memory-model-tool
> 
> Here is a litmus test showing that the XCHG (when compiled to LDAXR and 
> STLXR) is not atomic with respect to memory updates to other locations:
> -----
> AArch64 XCHG-nonatomic
> {
> 0:X1=locked; 0:X3=next;
> 1:X1=locked; 1:X3=next; 1:X5=tail;
> }
>  P0           | P1;
>  LDR W0, [X3] | MOV W0, #1;
>  CBZ W0, end  | STR W0, [X1]; (* init locked *) 
>  MOV W2, #2   | MOV W2, #0;
>  STR W2, [X1] | xchg:;
>  end:         | LDAXR W6, [X5];
>  NOP          | STLXR W4, W0, [X5];
>  NOP          | CBNZ W4, xchg;
>  NOP          | STR W0, [X3]; (* set next *) 
> exists
> (0:X2=2 /\ locked=1)
> -----
> (web version of herd7: http://diy.inria.fr/www/?record=aarch64)
> 
> P1 is trying to acquire the lock:
> - initializes locked
> - does the xchg on the tail of the mcslock
> - sets the next
> 
> P0 is releasing the lock:
> - if next is not set, just terminates
> - if next is set, stores 2 in locked
> 
> The initialization of locked should never overwrite the store of 2 to locked, 
> but it does.
> To prevent that reordering, the last store of P1 must have "release" 
> semantics, i.e., it must be an STLR.
> 
> This is equivalent to the reordering occurring in the mcslock of librte_eal.
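
For readers following along in C, here is a minimal C11 sketch of the pattern
described above. It is not the librte_eal source: the type `struct mcs_node`
and the functions `mcs_enqueue`, `mcs_lock` and `mcs_unlock` are hypothetical
names chosen for illustration. It assumes the fix is what the litmus test
suggests, i.e. the store that sets the predecessor's next pointer uses release
ordering so the earlier relaxed initialization of locked cannot become visible
after it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node type for illustration; not the librte_eal definition. */
struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_int locked;
};

/* Queue `me` behind the current tail; returns the previous tail (NULL if free). */
static struct mcs_node *
mcs_enqueue(_Atomic(struct mcs_node *) *tail, struct mcs_node *me)
{
    assert(me != NULL);

    /* "init locked" in the litmus test: a relaxed store. */
    atomic_store_explicit(&me->locked, 1, memory_order_relaxed);
    atomic_store_explicit(&me->next, NULL, memory_order_relaxed);

    /* The xchg on the tail. */
    struct mcs_node *prev =
        atomic_exchange_explicit(tail, me, memory_order_acq_rel);

    if (prev != NULL) {
        /* "set next": release ordering keeps the init of me->locked
         * above from becoming visible after this store. */
        atomic_store_explicit(&prev->next, me, memory_order_release);
    }
    return prev;
}

static void
mcs_lock(_Atomic(struct mcs_node *) *tail, struct mcs_node *me)
{
    if (mcs_enqueue(tail, me) == NULL)
        return; /* lock was free */
    while (atomic_load_explicit(&me->locked, memory_order_acquire))
        ; /* spin until the predecessor hands the lock over */
}

static void
mcs_unlock(_Atomic(struct mcs_node *) *tail, struct mcs_node *me)
{
    struct mcs_node *next =
        atomic_load_explicit(&me->next, memory_order_acquire);

    if (next == NULL) {
        struct mcs_node *expected = me;
        /* No successor visible: try to swing the tail back to NULL. */
        if (atomic_compare_exchange_strong_explicit(
                tail, &expected, NULL,
                memory_order_release, memory_order_relaxed))
            return;
        /* A successor is enqueuing; wait for it to publish itself. */
        while ((next = atomic_load_explicit(&me->next,
                                            memory_order_acquire)) == NULL)
            ;
    }
    /* Hand over the lock (P0's store to locked; this sketch writes 0). */
    atomic_store_explicit(&next->locked, 0, memory_order_release);
}
```

If the store to `prev->next` were `memory_order_relaxed` instead, the compiler
or a weakly ordered CPU would be free to make it visible before the store that
sets `me->locked` to 1, which is the outcome the `exists` clause of the litmus
test looks for.
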
> 
> Best regards,
> -Diogo
> 
> -----Original Message-----
> From: Thomas Monjalon [mailto:tho...@monjalon.net] 
> Sent: Tuesday, October 6, 2020 11:50 PM
> To: Phil Yang <phil.y...@arm.com>; Diogo Behrens <diogo.behr...@huawei.com>; 
> Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> Cc: dev@dpdk.org; nd <n...@arm.com>
> Subject: Re: [dpdk-dev] [PATCH] librte_eal: fix mcslock hang on weak memory
> 
> 31/08/2020 20:45, Honnappa Nagarahalli:
> > 
> > Hi Diogo,
> > 
> > Thanks for your explanation.
> > 
> > As documented in https://developer.arm.com/documentation/ddi0487/fc, 
> > section B2.9.5, Load-Exclusive and Store-Exclusive instruction usage 
> > restrictions:
> > " Between the Load-Exclusive and the Store-Exclusive, there are no 
> > explicit memory accesses, preloads, direct or indirect System register 
> > writes, address translation instructions, cache or TLB maintenance 
> > instructions, exception generating instructions, exception returns, or 
> > indirect branches."
> > [Honnappa] This is a requirement on the software, not on the 
> > micro-architecture.
> > We are having a few discussions internally and will get back soon.
> > 
> > So it is not allowed to insert (1) & (4) between (2, 3). The cmpxchg 
> > operation is atomic.
> 
> 
> What is the conclusion, please?


