On Fri, 2015-08-14 at 15:13 +0800, Kevin Hao wrote:
> On Thu, Aug 13, 2015 at 10:39:19PM -0500, Scott Wood wrote:
> > On Thu, 2015-08-13 at 19:51 +0800, Kevin Hao wrote:
> > > I didn't find anything unusual. But I think we do need to order the
> > > load/store of esel_next when acquire/release tcd lock. For acquire,
> > > add a data dependency to order the loads of lock and esel_next.
> > > For release, even there already have a "isync" here, but it doesn't
> > > guarantee any memory access order. So we still need "lwsync" for
> > > the two stores for lock and esel_next.
> >
> > I was going to say that esel_next is just a hint and it doesn't really
> > matter if we occasionally get the wrong value, unless it happens often
> > enough to cause more performance degradation than the lwsync causes.
> > However, with the A-008139 workaround we do need to read the same value
> > from esel_next both times. It might be less costly to save/restore an
> > additional register instead of lwsync, though.
>
> I will try to get some benchmark number to compare which method is a bit
> better.
> Do you have any recommended benchmark for a case this is?

lmbench lat_mem_rd with a stride chosen to maximize TLB misses. For the
uncontended case, one instance; for the contended case, two instances, one
pinned to each thread of a core.

-Scott

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev