On Thu, Apr 6, 2023 at 10:26 AM David Laight <david.lai...@aculab.com> wrote:
>
> From: Dave Hansen
> > Sent: 05 April 2023 17:37
> >
> > On 4/5/23 07:17, Uros Bizjak wrote:
> > > Add generic and target specific support for local{,64}_try_cmpxchg
> > > and wire up support for all targets that use local_t infrastructure.
> >
> > I feel like I'm missing some context.
> >
> > What are the actual end user visible effects of this series? Is there a
> > measurable decrease in perf overhead? Why go to all this trouble for
> > perf? Who else will use local_try_cmpxchg()?
>
> I'm assuming the local_xxx operations only have to be safe wrt interrupts?
> On x86 it is possible that an alternate instruction sequence
> that doesn't use a locked instruction may actually be faster!

Please note that the "local" functions do not use the lock prefix. Only the
atomicity of the cmpxchg instruction itself is exploited, since the operation
only needs to be safe wrt interrupts on the local CPU.

Uros.

> Although, maybe, any kind of locked cmpxchg just needs to ensure
> the cache line isn't 'stolen', so apart from possible slight
> delays on another cpu that gets a cache miss for the line it
> all makes little difference.
> The cache line miss costs a lot anyway, line bouncing more
> and is best avoided.
> So is there actually much of a benefit at all?
>
> Clearly the try_cmpxchg helps - but that is a different issue.
>
> David
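
For readers following the thread, here is a minimal userspace sketch of the
loop shape that a try_cmpxchg-style helper allows: the helper returns a success
flag and, on failure, writes the value it observed back into the caller's "old"
variable, so the retry loop needs no separate reload or compare. All demo_*
names are hypothetical and not part of the series. C11 atomics stand in for the
kernel's local_t API here, so on x86 the compiler will emit a lock-prefixed
cmpxchg, unlike the kernel's "local" operations discussed above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's local_t. */
typedef struct {
    atomic_long v;
} demo_local_t;

/*
 * Stand-in for a try_cmpxchg-style helper (assumption: C11 atomics used
 * only for illustration). Returns true if the swap happened; on failure
 * *old is updated to the value currently stored.
 */
static bool demo_local_try_cmpxchg(demo_local_t *l, long *old, long new_val)
{
    return atomic_compare_exchange_strong_explicit(&l->v, old, new_val,
                                                   memory_order_relaxed,
                                                   memory_order_relaxed);
}

/* Add delta to the counter without exceeding limit; returns the stored value. */
static long demo_add_capped(demo_local_t *l, long delta, long limit)
{
    long old, new_val;

    assert(l != NULL);
    old = atomic_load_explicit(&l->v, memory_order_relaxed);
    do {
        /* Recompute from the latest observed value on every attempt. */
        new_val = old + delta;
        if (new_val > limit)
            new_val = limit;
    } while (!demo_local_try_cmpxchg(l, &old, new_val));

    return new_val;
}
```

With a plain cmpxchg that returns the previous value, the same loop would have
to compare the returned value against "old" and copy it back by hand on every
iteration; the try_ form folds both steps into the helper.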