On 11/23/2015 04:41 AM, Ling Ma wrote:
> Hi Longman,
>
> The attachments include the user space application thread.c and the
> kernel patch spinlock-test.patch, based on kernel 4.3.0-rc4.
>
> We ran thread.c with the kernel patch, testing the original and the new
> spinlock respectively.  perf top -G indicates that thread.c causes the
> cache_alloc_refill and cache_flusharray functions to spend ~25% of their
> time on the original spinlock; after introducing the new spinlock in
> those two functions, the cost drops to ~22%.
>
> The printed data also tell us the new spinlock improves performance by
> about 15% (93841765576 / 81036259588) on E5-2699V3.
>
> Appreciate your comments.
>
I saw that you made the following change in the code:

 static __always_inline void queued_spin_lock(struct qspinlock *lock)
 {
 	u32 val;
-
+repeat:
 	val = atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL);
 	if (likely(val == 0))
 		return;
-	queued_spin_lock_slowpath(lock, val);
+	goto repeat;
+	//queued_spin_lock_slowpath(lock, val);
 }

This effectively changes the queued spinlock into an unfair byte lock.
Without a pause to moderate the cmpxchg() calls, that is especially bad
for performance.  Does the performance data above refer to this unfair
byte lock versus your new spinlock?
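To illustrate the point about the missing pause, here is a rough sketch
(the function name is made up; the qspinlock fields, _Q_LOCKED_VAL and
cpu_relax() are the usual kernel ones) of how an unfair cmpxchg-based
lock is normally written, spinning on a plain read and pausing between
attempts instead of hammering cmpxchg():

static __always_inline void example_unfair_spin_lock(struct qspinlock *lock)
{
	for (;;) {
		/* try to grab the lock with a single atomic op */
		if (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) == 0)
			return;

		/* wait with a read-only spin and a pause before retrying,
		 * so the lock cacheline isn't bounced on every iteration */
		while (atomic_read(&lock->val))
			cpu_relax();
	}
}

Even with that pause, an unfair lock like this behaves quite differently
from the queued spinlock under contention, so the comparison may not be
telling you much about the original qspinlock.

Cheers,
Longman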