[PATCH RFC 2/3] mutex: restrict mutex spinning to only one task per mutex

2013-04-04 Thread Waiman Long
-7.7% | +0.2% | +1.3% | It can be seen that this patch improves performance for the fserver and new_fserver workloads while suffering a slight drop in performance for the other workloads. Signed-off-by: Waima

[PATCH RFC 1/3] mutex: Make more scalable by doing less atomic operations

2013-04-04 Thread Waiman Long
% | +--+---++-+ Signed-off-by: Waiman Long Reviewed-by: Davidlohr Bueso --- arch/x86/include/asm/mutex.h | 16 kernel/mutex.c |9 ++--- kernel/mutex.h |8 3 files changed, 30 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/mutex.h b

[PATCH RFC 0/3] mutex: Improve mutex performance by doing less atomic-ops & spinning

2013-04-04 Thread Waiman Long
%| +--+-+-+--+ So patch 2 is better at low and high load. Patch 3 is better at intermediate load. For other AIM7 workloads, patch 3 is generally better. Waiman Long (3): mutex: Make more scalable by doing less atomic operations mutex: restrict mutex spinning to only

[PATCH RFC 3/3] mutex: dynamically disable mutex spinning at high load

2013-04-04 Thread Waiman Long
new_fserver workloads while it is still generally positive for the other AIM7 workloads. Signed-off-by: Waiman Long Reviewed-by: Davidlohr Bueso --- kernel/sched/core.c | 22 ++ 1 files changed, 22 insertions(+), 0 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c

[PATCH v2 4/4] dcache: don't need to take d_lock in prepend_path()

2013-04-05 Thread Waiman Long
ken. Signed-off-by: Waiman Long --- fs/dcache.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 9477d80..e3d6543 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2529,6 +2529,7 @@ static int prepend_name(char **buffer, int *buflen, struct q

[PATCH v2 RFC 3/4] dcache: change rename_lock to a sequence read/write lock

2013-04-05 Thread Waiman Long
/write lock declaration and access functions. When applying this patch to 3.8 or earlier releases, the unused function d_path_with_unreachable() in fs/dcache.c should be removed to avoid a compilation warning. Signed-off-by: Waiman Long --- fs/autofs4/waitq.c |6 ++-- fs/ceph/mds_client.c

[PATCH v2 0/4] dcache: make dcache more scalable on large system

2013-04-05 Thread Waiman Long
ependent on patch 2. The other 2 patches are independent and can be applied individually. Signed-off-by: Waiman Long Waiman Long (4): dcache: Don't take unnecessary lock in d_count update dcache: introduce a new sequence read/write lock type dcache: change rename_lock to a sequence read/

[PATCH RFC v2 2/4] dcache: introduce a new sequence read/write lock type

2013-04-05 Thread Waiman Long
at writers may be starved if there is a lot of contention. Signed-off-by: Waiman Long --- include/linux/seqrwlock.h | 137 + 1 files changed, 137 insertions(+), 0 deletions(-) create mode 100644 include/linux/seqrwlock.h diff --git a/include/linux/s

[PATCH v2 1/4] dcache: Don't take unnecessary lock in d_count update

2013-04-05 Thread Waiman Long
10-100 users | 200-1000 users | 1100-2000 users | +--+---++-+ | high_systime | -0.1% | -0.2% | +1.2% | +--+---++-----+ Signed-off-by: Waiman Long --- fs/dc

Re: [PATCH v2 1/4] dcache: Don't take unnecessary lock in d_count update

2013-04-05 Thread Waiman Long
On 04/05/2013 01:12 PM, Al Viro wrote: @@ -635,22 +640,14 @@ struct dentry *dget_parent(struct dentry *dentry) { struct dentry *ret; -repeat: - /* -* Don't need rcu_dereference because we re-check it was correct under -* the lock. -*/ rcu_read_lock

Re: [PATCH 0/4] dcache: make Oracle more scalable on large systems

2013-02-28 Thread Waiman Long
On 02/22/2013 07:13 PM, Andi Kleen wrote: That seems to me like an application problem - poking at what the kernel is doing via diagnostic interfaces so often that it gets in the way of the kernel actually doing stuff is not a problem the kernel can solve. I agree with you that the application s

Re: [PATCH 0/4] dcache: make Oracle more scalable on large systems

2013-02-28 Thread Waiman Long
On 02/28/2013 03:39 PM, Waiman Long wrote: activity level. Most of the d_path() calls last for about 1ms. There are a couple of those that last for more than 10ms. A correction. The time unit here should be us, not ms. Sorry for the mistake. -Longman -- To unsubscribe from this list: send

[PATCH 0/4] dcache: make Oracle more scalable on large systems

2013-02-19 Thread Waiman Long
s less spinlock contention in functions like dput(), but the function itself ran a little bit longer on average. The git-diff test showed no difference in performance. There is a slight increase in system time compensated by a slight decrease in user time. Signed-off-by: Waiman Long Waiman Long

[PATCH 1/4] dcache: Don't take unncessary lock in d_count update

2013-02-19 Thread Waiman Long
en. Depending on how frequent the cmpxchg instruction is used (d_count > 1 or 2), the new code can be faster or slower than the original one. Signed-off-by: Waiman Long --- fs/dcache.c| 23 ++ fs/namei.c |2 +- include/li

[PATCH 4/4] dcache: don't need to take d_lock in prepend_path()

2013-02-19 Thread Waiman Long
ken. Signed-off-by: Waiman Long --- fs/dcache.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index b1487e2..0e911fc 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2547,6 +2547,7 @@ static int prepend_name(char **buffer, int *buflen, struct q

[PATCH 3/4] dcache: change rename_lock to a sequence read/write lock

2013-02-19 Thread Waiman Long
/write lock declaration and access functions. Signed-off-by: Waiman Long --- fs/autofs4/waitq.c |6 ++-- fs/ceph/mds_client.c |4 +- fs/cifs/dir.c |4 +- fs/dcache.c| 87 --- fs/nfs/namespace.c |6

[PATCH 2/4] dcache: introduce a new sequence read/write lock type

2013-02-19 Thread Waiman Long
at writers may be starved if there is a lot of contention. Signed-off-by: Waiman Long --- include/linux/seqrwlock.h | 138 + 1 files changed, 138 insertions(+), 0 deletions(-) create mode 100644 include/linux/seqrwlock.h diff --git a/include/linux/s

Re: [PATCH 0/4] dcache: make Oracle more scalable on large systems

2013-02-21 Thread Waiman Long
On 02/21/2013 07:13 PM, Andi Kleen wrote: Dave Chinner writes: On Tue, Feb 19, 2013 at 01:50:55PM -0500, Waiman Long wrote: It was found that the Oracle database software issues a lot of calls to the seq_path() kernel function which translates a (dentry, mnt) pair to an absolute path. The

Re: [PATCH RFC v2 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation

2013-08-26 Thread Waiman Long
On 08/22/2013 09:28 AM, Alexander Fyodorov wrote: 22.08.2013, 05:04, "Waiman Long": On 08/21/2013 11:51 AM, Alexander Fyodorov wrote: In this case, we should have smp_wmb() before freeing the lock. The question is whether we need to do a full mb() instead. The x86 ticket spinlock unlo

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-29 Thread Waiman Long
On 08/28/2013 09:40 PM, Linus Torvalds wrote: Just FYI: I've merged two preparatory patches in my tree for the whole lockref thing. Instead of applying your four patches as-is during the merge window, I ended up writing two patches that introduce the concept and use it in the dentry code *without

Re: [PATCH RFC v2 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation

2013-08-29 Thread Waiman Long
On 08/27/2013 08:09 AM, Alexander Fyodorov wrote: I also thought that the x86 spinlock unlock path was an atomic add. It just comes to my realization recently that this is not the case. The UNLOCK_LOCK_PREFIX will be mapped to "" except for some old 32-bit x86 processors. Hmm, I didn't know that

Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file

2013-10-01 Thread Waiman Long
On 10/01/2013 12:48 PM, Tim Chen wrote: On Mon, 2013-09-30 at 12:36 -0400, Waiman Long wrote: On 09/30/2013 12:10 PM, Jason Low wrote: On Mon, 2013-09-30 at 11:51 -0400, Waiman Long wrote: On 09/28/2013 12:34 AM, Jason Low wrote: Also, below is what the mcs_spin_lock() and mcs_spin_unlock

Re: [PATCH] rwsem: reduce spinlock contention in wakeup code path

2013-10-01 Thread Waiman Long
On 10/01/2013 03:33 AM, Ingo Molnar wrote: * Waiman Long wrote: I think Waiman's patches (even the later ones) made the queued rwlocks be a side-by-side implementation with the old rwlocks, and I think that was just being unnecessarily careful. It might be useful for testing to have a c

Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file

2013-10-01 Thread Waiman Long
On 10/01/2013 05:16 PM, Tim Chen wrote: On Tue, 2013-10-01 at 16:01 -0400, Waiman Long wrote: The cpu could still be executing out of order load instruction from the critical section before checking node->locked? Probably smp_mb() is still needed. Tim But this is the lock function

[PATCH v4 2/3] qrwlock x86: Enable x86 to use queue read/write lock

2013-10-02 Thread Waiman Long
x86 which tends to have the largest NUMA machines compared with the other architectures. This patch will improve the scalability of those large machines. Signed-off-by: Waiman Long --- arch/x86/Kconfig |1 + arch/x86/include/asm/spinlock.h |2 ++ arch/x86

[PATCH v4 0/3] qrwlock: Introducing a queue read/write lock implementation

2013-10-02 Thread Waiman Long
esting is done. Signed-off-by: Waiman Long Waiman Long (3): qrwlock: A queue read/write lock implementation qrwlock x86: Enable x86 to use queue read/write lock qrwlock: Enable fair queue read/write lock arch/x86/Kconfig |1 + arch/x86/include/asm/spinlock.h

[PATCH v4 3/3] qrwlock: Enable fair queue read/write lock

2013-10-02 Thread Waiman Long
: Waiman Long --- include/linux/rwlock.h | 15 +++ include/linux/rwlock_types.h | 13 + lib/spinlock_debug.c | 19 +++ 3 files changed, 47 insertions(+), 0 deletions(-) diff --git a/include/linux/rwlock.h b/include/linux/rwlock.h index

[PATCH v4 1/3] qrwlock: A queue read/write lock implementation

2013-10-02 Thread Waiman Long
] [k] perf_event_aux_ctx 1.01% reaim [kernel.kallsyms] [k] perf_event_aux Tim Chen also tested the qrwlock with Ingo's patch on a 4-socket machine. It was found the performance improvement of 11% was the same with regular rwlock or queue rwlock. Signed-off-by: Waiman Long --- includ

Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file

2013-10-02 Thread Waiman Long
On 09/26/2013 06:42 PM, Jason Low wrote: On Thu, 2013-09-26 at 14:41 -0700, Tim Chen wrote: Okay, that would make sense for consistency because we always first set node->lock = 0 at the top of the function. If we prefer to optimize this a bit though, perhaps we can first move the node->lock =

Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file

2013-10-02 Thread Waiman Long
On 10/02/2013 02:43 PM, Tim Chen wrote: On Tue, 2013-10-01 at 21:25 -0400, Waiman Long wrote: If the lock and unlock functions are done right, there should be no overlap of critical sections. So it is the job of the lock/unlock functions to make sure that critical section code won't leak out.

Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file

2013-10-02 Thread Waiman Long
On 10/02/2013 03:30 PM, Jason Low wrote: On Wed, Oct 2, 2013 at 12:19 PM, Waiman Long wrote: On 09/26/2013 06:42 PM, Jason Low wrote: On Thu, 2013-09-26 at 14:41 -0700, Tim Chen wrote: Okay, that would make sense for consistency because we always first set node->lock = 0 at the top of

Re: [PATCH v5 01/12] spinlock: A new lockref structure for lockless update of refcount

2013-07-08 Thread Waiman Long
On 07/05/2013 02:59 PM, Thomas Gleixner wrote: On Fri, 5 Jul 2013, Waiman Long wrote: + * If the spinlock& reference count optimization feature is disabled, + * the spinlock and reference count are accessed separately on its own. + */ +struct lockref { + unsigned int re

Re: [PATCH v5 00/12] Lockless update of reference count protected by spinlock

2013-07-08 Thread Waiman Long
On 07/05/2013 04:33 PM, Thomas Gleixner wrote: On Fri, 5 Jul 2013, Waiman Long wrote: patch 1:Introduce the new lockref data structure patch 2:Enable x86 architecture to use the feature patch 3:Rename all d_count references to d_refcount And after that the mail

Re: [PATCH 1/2 v5] SELinux: Reduce overhead of mls_level_isvalid() function call

2013-07-08 Thread Waiman Long
On 07/08/2013 12:30 PM, Paul Moore wrote: On Friday, July 05, 2013 01:10:32 PM Waiman Long wrote: On 06/11/2013 07:49 AM, Stephen Smalley wrote: On 06/10/2013 01:55 PM, Waiman Long wrote: ... Signed-off-by: Waiman Long Acked-by: Stephen Smalley Thanks for the Ack. Will that patch go into

[PATCH v6 11/14] nilfs2: replace direct access of d_count with the d_count() helper

2013-07-08 Thread Waiman Long
All readonly references to d_count outside of the core dcache code should be changed to use the new d_count() helper as they shouldn't access its value directly. There is no change in logic and everything should just work. Signed-off-by: Waiman Long Acked-by: Ryusuke Konishi --- fs/n

[PATCH v6 02/14] spinlock: Enable x86 architecture to do lockless refcount update

2013-07-08 Thread Waiman Long
This patch enables the x86 architecture to do lockless reference count update using the generic lockref implementation with default parameters. Only the x86/Kconfig file needs to be changed. Signed-off-by: Waiman Long --- arch/x86/Kconfig |3 +++ 1 files changed, 3 insertions(+), 0

[PATCH v6 01/14] spinlock: A new lockref structure for lockless update of refcount

2013-07-08 Thread Waiman Long
the old code path of acquiring a lock before doing the update. Similarly, this is controlled by the LOCKREF_RETRY_COUNT macro. Signed-off-by: Waiman Long --- include/asm-generic/spinlock_refcount.h | 46 +++ include/linux/spinlock_refcount.h | 142
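The fallback described in this snippet (try a lockless update, then revert to the old lock-then-update path once a retry budget is spent) can be sketched in user-space C11 as below. This is only an illustration under stated assumptions, not the patch's code: the struct layout, the sketch_* names, and the retry value of 4 are all hypothetical, and the actual LOCKREF_RETRY_COUNT value is not shown in this listing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical retry budget standing in for LOCKREF_RETRY_COUNT. */
#define SKETCH_RETRY_COUNT 4

/* Assumed layout: low 32 bits hold the lock word (0 = unlocked),
 * high 32 bits hold the reference count. */
struct lockref_sketch {
    _Atomic uint64_t lock_count;
};

static inline uint32_t sketch_lock_of(uint64_t v)  { return (uint32_t)v; }
static inline uint32_t sketch_count_of(uint64_t v) { return (uint32_t)(v >> 32); }

/* Try to bump the count with compare-and-swap only.
 * Gives up if the lock is held or the retry budget runs out. */
static bool sketch_get_lockless(struct lockref_sketch *lr)
{
    uint64_t old = atomic_load(&lr->lock_count);

    for (int i = 0; i < SKETCH_RETRY_COUNT; i++) {
        if (sketch_lock_of(old) != 0)
            return false;                       /* someone holds the lock */
        uint64_t next = old + ((uint64_t)1 << 32);
        if (atomic_compare_exchange_weak(&lr->lock_count, &old, next))
            return true;
        /* A failed exchange reloaded 'old'; loop and try again. */
    }
    return false;
}

/* Minimal spinlock on the low word. */
static void sketch_lock(struct lockref_sketch *lr)
{
    for (;;) {
        uint64_t old = atomic_load(&lr->lock_count);
        if (sketch_lock_of(old) == 0 &&
            atomic_compare_exchange_weak(&lr->lock_count, &old, old | 1))
            return;
    }
}

static void sketch_unlock(struct lockref_sketch *lr)
{
    assert(sketch_lock_of(atomic_load(&lr->lock_count)) != 0);
    /* Clear the low 32 bits, keep the count. */
    atomic_fetch_and(&lr->lock_count, ~(uint64_t)0xffffffffu);
}

/* Lockless fast path first, then the lock-then-update slow path. */
static void sketch_get(struct lockref_sketch *lr)
{
    if (sketch_get_lockless(lr))
        return;
    sketch_lock(lr);
    atomic_fetch_add(&lr->lock_count, (uint64_t)1 << 32);
    sketch_unlock(lr);
}
```

Packing the lock and the count into one word is what lets a single compare-and-swap verify "lock is free" and update the count together. Counter overflow is ignored in this sketch.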

[PATCH v6 14/14] dcache: Enable lockless update of refcount in dentry structure

2013-07-08 Thread Waiman Long
won't work for those macros and so the errors should be ignored. Signed-off-by: Waiman Long --- fs/dcache.c| 18 -- include/linux/dcache.h | 17 ++--- 2 files changed, 22 insertions(+), 13 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c in

[PATCH v6 12/14] lustre-fs: Use the standard d_count() helper to access refcount

2013-07-08 Thread Waiman Long
The Lustre FS should use the newly defined d_count() helper function to access the dentry's reference count instead of defining its own d_refcount() macro for the same purpose. Since the current lustre code is marked as broken, no build test was attempted for this change. Signed-off-by: W

[PATCH v6 13/14] dcache: rename d_count field of dentry to d_refcount

2013-07-08 Thread Waiman Long
change the reference count value. They will be modified to use a different reference count name "d_refcount" which is unique in the kernel source code. Signed-off-by: Waiman Long --- fs/dcache.c| 54 fs/namei.c

[PATCH v6 10/14] nfs: replace direct access of d_count with the d_count() helper

2013-07-08 Thread Waiman Long
All readonly references to d_count outside of the core dcache code should be changed to use the new d_count() helper as they shouldn't access its value directly. There is no change in logic and everything should just work. Signed-off-by: Waiman Long --- fs/nfs/dir.c|6 +++--- f

[PATCH v6 09/14] file locking: replace direct access of d_count with the d_count() helper

2013-07-08 Thread Waiman Long
All readonly references to d_count outside of the core dcache code should be changed to use the new d_count() helper as they shouldn't access its value directly. There is no change in logic and everything should just work. Signed-off-by: Waiman Long --- fs/locks.c |2 +- 1 files chang

[PATCH v6 03/14] dcache: Add a new helper function d_count() to return refcount

2013-07-08 Thread Waiman Long
This patch adds a new helper function d_count() in dcache.h for returning the current reference count of the dentry object. It should be used by all the files outside of the core dcache.c and namei.c files. Signed-off-by: Waiman Long --- include/linux/dcache.h | 10 ++ 1 files changed
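A read-only accessor of the kind this snippet describes might look like the sketch below. The struct here is a hypothetical stand-in with a single field, not the kernel's struct dentry, and the name d_count_sketch() is an assumption chosen to avoid implying this is the patch's actual definition.

```c
/* Hypothetical stand-in for the kernel's struct dentry; only the field
 * relevant to this sketch is modelled. */
struct dentry_sketch {
    unsigned int d_count;   /* reference count */
};

/* Read-only accessor: code outside the core dcache files calls this
 * instead of reading the field, so the field can later be renamed or
 * repacked without touching every caller. */
static inline unsigned int d_count_sketch(const struct dentry_sketch *dentry)
{
    return dentry->d_count;
}
```

With such a helper, a filesystem test written as `dentry->d_count > 1` would become `d_count_sketch(dentry) > 1`, which is the kind of mechanical replacement the per-filesystem patches in this series describe.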

[PATCH v6 08/14] ecrypt-fs: replace direct access of d_count with the d_count() helper

2013-07-08 Thread Waiman Long
All readonly references to d_count outside of the core dcache code should be changed to use the new d_count() helper as they shouldn't access its value directly. There is no change in logic and everything should just work. Signed-off-by: Waiman Long --- fs/ecryptfs/inode.c |2 +- 1

[PATCH v6 06/14] coda-fs: replace direct access of d_count with the d_count() helper

2013-07-08 Thread Waiman Long
All readonly references to d_count outside of the core dcache code should be changed to use the new d_count() helper as they shouldn't access its value directly. There is no change in logic and everything should just work. Signed-off-by: Waiman Long --- fs/coda/dir.c |2 +- 1 files ch

[PATCH v6 05/14] ceph-fs: replace direct access of d_count with the d_count() helper

2013-07-08 Thread Waiman Long
All readonly references to d_count outside of the core dcache code should be changed to use the new d_count() helper as they shouldn't access its value directly. There is no change in logic and everything should just work. Signed-off-by: Waiman Long --- fs/ceph/inode.c |4 ++-

[PATCH v6 00/14] Lockless update of reference count protected by spinlock

2013-07-08 Thread Waiman Long
in shaping this patchset. Signed-off-by: Waiman Long Waiman Long (14): spinlock: A new lockref structure for lockless update of refcount spinlock: Enable x86 architecture to do lockless refcount update dcache: Add a new helper function d_count() to return refcount auto-fs: replace direc

[PATCH v6 07/14] config-fs: replace direct access of d_count with the d_count() helper

2013-07-08 Thread Waiman Long
All readonly references to d_count outside of the core dcache code should be changed to use the new d_count() helper as they shouldn't access its value directly. There is no change in logic and everything should just work. Signed-off-by: Waiman Long --- fs/configfs/dir.c |2 +- 1

[PATCH v6 04/14] auto-fs: replace direct access of d_count with the d_count() helper

2013-07-08 Thread Waiman Long
All readonly references to d_count outside of the core dcache code should be changed to use the new d_count() helper as they shouldn't access its value directly. There is no change in logic and everything should just work. Signed-off-by: Waiman Long --- fs/autofs4/expire.c |8 ---

Re: [PATCH v6 03/14] dcache: Add a new helper function d_count() to return refcount

2013-07-11 Thread Waiman Long
On 07/08/2013 09:09 PM, Waiman Long wrote: This patch adds a new helper function d_count() in dcache.h for returning the current reference count of the dentry object. It should be used by all the files outside of the core dcache.c and namei.c files. I want to know people's thought of spi

[PATCH RFC 2/2] x86 qrwlock: Enable x86 to use queue read/write lock

2013-07-12 Thread Waiman Long
This patch makes the necessary changes at the x86 architecture specific layer to enable the presence of the CONFIG_QUEUE_RWLOCK kernel option to replace the plain read/write lock by the queue read/write lock. Signed-off-by: Waiman Long --- arch/x86/Kconfig |3 +++ arch

[PATCH RFC 0/2] qrwlock: Introducing a queue read/write lock implementation

2013-07-12 Thread Waiman Long
just replacing the current read/write lock with the queue read/write lock, we can have a faster and more deterministic system. Signed-off-by: Waiman Long Waiman Long (2): qrwlock: A queue read/write lock implementation x86 qrwlock: Enable x86 to use queue read/write lock arch/x86/Kconfig

[PATCH RFC 1/2] qrwlock: A queue read/write lock implementation

2013-07-12 Thread Waiman Long
0.9% |+0.9% | |new_fserver (HT off)|-1.2%| +29.8% | +40.5% | ++-+--+---+ Signed-off-by: Waiman Long --- include/asm-generic/qrwlock.h | 124 + lib/Kconfig | 11 ++

Re: [PATCH] lockref: use cmpxchg64 explicitly for lockless updates

2013-09-19 Thread Waiman Long
On 09/19/2013 02:11 PM, Linus Torvalds wrote: On Thu, Sep 19, 2013 at 1:06 PM, Will Deacon wrote: The cmpxchg() function tends not to support 64-bit arguments on 32-bit architectures. This could be either due to use of unsigned long arguments (like on ARM) or lack of instruction support (cmpxch

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-04 Thread Waiman Long
On 09/03/2013 03:09 PM, Linus Torvalds wrote: On Tue, Sep 3, 2013 at 8:34 AM, Linus Torvalds wrote: I suspect the tty_ldisc_lock() could be made to go away if we care. Heh. I just pulled the tty patches from Greg, and the locking has changed completely. It may actually fix your AIM7 test-cas

[PATCH] dcache: Translating dentry into pathname without taking rename_lock

2013-09-04 Thread Waiman Long
by the running of perf will go away and we will have a more accurate perf profile. Signed-off-by: Waiman Long --- fs/dcache.c | 118 +-- 1 files changed, 82 insertions(+), 36 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 96

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-04 Thread Waiman Long
On 09/04/2013 11:14 AM, Linus Torvalds wrote: On Wed, Sep 4, 2013 at 7:52 AM, Waiman Long wrote: The latest tty patches did work. The tty related spinlock contention is now completely gone. The short workload can now reach over 8M JPM which is the highest I have ever seen. Good. And this was

Re: [PATCH] dcache: Translating dentry into pathname without taking rename_lock

2013-09-04 Thread Waiman Long
On 09/04/2013 03:05 PM, Waiman Long wrote: When running the AIM7's short workload, Linus' lockref patch eliminated most of the spinlock contention. However, there were still some left: 8.46% reaim [kernel.kallsyms] [k] _raw_spin_lock |--42.21

Re: [PATCH] dcache: Translating dentry into pathname without taking rename_lock

2013-09-04 Thread Waiman Long
On 09/04/2013 03:11 PM, Al Viro wrote: On Wed, Sep 04, 2013 at 03:05:23PM -0400, Waiman Long wrote: static int prepend_name(char **buffer, int *buflen, struct qstr *name) { - return prepend(buffer, buflen, name->name, name->len); + /* +* With RCU path tracing,

Re: [PATCH] dcache: Translating dentry into pathname without taking rename_lock

2013-09-04 Thread Waiman Long
On 09/04/2013 03:43 PM, Al Viro wrote: On Wed, Sep 04, 2013 at 03:33:00PM -0400, Waiman Long wrote: I have thought about that. But if a d_move() is going on, the string in the buffer will be discarded as the sequence number will change. So whether or not it has an embedded null byte shouldn

Re: [PATCH] dcache: Translating dentry into pathname without taking rename_lock

2013-09-04 Thread Waiman Long
On 09/04/2013 04:40 PM, John Stoffel wrote: "Waiman" == Waiman Long writes: Waiman> In terms of AIM7 performance, this patch has a performance boost of Waiman> about 6-7% on top of Linus' lockref patch on an 8-socket 80-core DL980. Waiman> User Range | 10

Re: [PATCH] dcache: Translating dentry into pathname without taking rename_lock

2013-09-04 Thread Waiman Long
On 09/04/2013 05:31 PM, Linus Torvalds wrote: On Wed, Sep 4, 2013 at 12:05 PM, Waiman Long wrote: + rcu_read_unlock(); + if (read_seqretry(&rename_lock, seq)) + goto restart; Btw, you have this pattern twice, and while it's not necessarily incorrect, i

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-04 Thread Waiman Long
On 09/04/2013 05:34 PM, Linus Torvalds wrote: On Wed, Sep 4, 2013 at 12:25 PM, Waiman Long wrote: Yes, the perf profile was taken from an 80-core machine. There isn't any scalability issue hiding for the short workload on an 80-core machine. However, I am certain that more may pop up

Re: [PATCH] dcache: Translating dentry into pathname without taking rename_lock

2013-09-05 Thread Waiman Long
On 09/05/2013 12:30 AM, George Spelvin wrote: As long as you're removing locks from prepend_name and complicating its innards, I notice that each and every call site follows it by prepending "/". How about moving that into prepend_name as well? Also, if you happen to feel like it, you can delet

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-05 Thread Waiman Long
On 09/05/2013 09:31 AM, Ingo Molnar wrote: * Waiman Long wrote: The latest tty patches did work. The tty related spinlock contention is now completely gone. The short workload can now reach over 8M JPM which is the highest I have ever seen. The perf profile was: 5.85% reaim reaim

Re: [PATCH] lockref: remove cpu_relax() again

2013-09-05 Thread Waiman Long
On 09/05/2013 11:31 AM, Linus Torvalds wrote: On Thu, Sep 5, 2013 at 6:18 AM, Heiko Carstens wrote: *If* however the cpu_relax() makes sense on other platforms maybe we could add something like we have already with "arch_mutex_cpu_relax()": I actually think it won't. The lockref cmpxchg isn'

[PATCH v2 1/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-05 Thread Waiman Long
tion contributed by running perf without this patch was about 16%. With this patch, the spinlock contention caused by the running of perf will go away and we will have a more accurate perf profile. Signed-off-by: Waiman Long --- fs/dcache.c | 213 +

[PATCH v2 0/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-05 Thread Waiman Long
ock. - Make code re-factoring suggested by George Spelvin. Waiman Long (1): dcache: Translating dentry into pathname without taking rename_lock fs/dcache.c | 213 ++- 1 files changed, 151 insertions(+), 62 deletions(-)

Re: [PATCH v2 1/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-05 Thread Waiman Long
On 09/05/2013 03:35 PM, Linus Torvalds wrote: No. Stop all these stupid games. No d_lock, no trying to make d_name/d_len consistent. No "compare d_name against d_iname". No EINVAL. No idiotic racy "let's fetch each byte one-by one and test them against NUL", which is just racy and stupid. Ju

Re: [PATCH v2 1/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-05 Thread Waiman Long
On 09/05/2013 04:04 PM, Al Viro wrote: On Thu, Sep 05, 2013 at 02:55:16PM -0400, Waiman Long wrote: + const char *dname = ACCESS_ONCE(dentry->d_name.name); + u32 dlen = dentry->d_name.len; + int error; + + if (likely(dname == (const char *)dentry->

Re: [PATCH v2 1/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-05 Thread Waiman Long
On 09/05/2013 04:42 PM, Linus Torvalds wrote: On Thu, Sep 5, 2013 at 1:29 PM, Waiman Long wrote: It is not as simple as doing a strncpy(). Yes it damn well is. Stop the f*cking stupid arguments, and instead listen to what I say. Here. Let me bold-face the most important part for you, so

[PATCH v3 1/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-06 Thread Waiman Long
When taking the perf profile of the high-systime workload, the amount of spinlock contention contributed by running perf without this patch was about 16%. With this patch, the spinlock contention caused by the running of perf will go away and we will have a more accurate perf profile. Signed-off-

[PATCH v3 0/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-06 Thread Waiman Long
safety. - Replace memchr() by a byte-by-byte checking for loop. - Try lockless dentry to pathname conversion 3 times before falling back to taking the rename_lock to prevent live-lock. - Make code re-factoring suggested by George Spelvin. Waiman Long (1): dcache: Translating dentry into pathname w
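The changelog item about trying the lockless conversion 3 times before taking the lock can be illustrated with a small user-space sequence-lock sketch. Everything here is an assumption for illustration: the sketch_* names, the struct layout, and the use of C11 atomics are not from the patch, and a single integer stands in for the pathname being built. Only the count of 3 lockless attempts comes from the changelog.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* From the changelog: three lockless attempts before falling back. */
#define SKETCH_LOCKLESS_TRIES 3

struct seqlock_sketch {
    atomic_uint seq;     /* odd while a writer is updating */
    atomic_flag lock;    /* serializes writers and fallback readers */
    atomic_int  value;   /* stand-in for the protected data */
};

static void sketch_init(struct seqlock_sketch *sl, int initial)
{
    atomic_init(&sl->seq, 0u);
    atomic_flag_clear(&sl->lock);
    atomic_init(&sl->value, initial);
}

static void sketch_lock(struct seqlock_sketch *sl)
{
    while (atomic_flag_test_and_set(&sl->lock))
        ;   /* spin until the flag was previously clear */
}

static void sketch_unlock(struct seqlock_sketch *sl)
{
    atomic_flag_clear(&sl->lock);
}

static void sketch_write(struct seqlock_sketch *sl, int v)
{
    sketch_lock(sl);
    atomic_fetch_add(&sl->seq, 1u);   /* sequence becomes odd */
    atomic_store(&sl->value, v);
    atomic_fetch_add(&sl->seq, 1u);   /* sequence becomes even again */
    sketch_unlock(sl);
}

/* Read the value; *used_lock reports whether the fallback path ran. */
static int sketch_read(struct seqlock_sketch *sl, bool *used_lock)
{
    for (int i = 0; i < SKETCH_LOCKLESS_TRIES; i++) {
        unsigned start;

        /* Wait for an even sequence number (no writer in progress). */
        while ((start = atomic_load(&sl->seq)) & 1u)
            ;
        int v = atomic_load(&sl->value);
        if (atomic_load(&sl->seq) == start) {
            *used_lock = false;
            return v;                 /* no writer interfered */
        }
    }

    /* Too many retries: take the lock so the read cannot be disturbed. */
    sketch_lock(sl);
    int v = atomic_load(&sl->value);
    sketch_unlock(sl);
    *used_lock = true;
    return v;
}
```

Bounding the retries is what prevents live-lock: a reader that keeps losing to writers eventually excludes them instead of restarting forever.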

Re: [PATCH v3 1/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-06 Thread Waiman Long
On 09/06/2013 04:52 PM, Linus Torvalds wrote: On Fri, Sep 6, 2013 at 9:08 AM, Waiman Long wrote: This patch will replace the writer's write_seqlock/write_sequnlock sequence of the rename_lock of the callers of the prepend_path() and __dentry_path() functions with the reader's rea

Re: [PATCH v3 1/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-09 Thread Waiman Long
On 09/07/2013 02:07 PM, Al Viro wrote: On Sat, Sep 07, 2013 at 10:52:02AM -0700, Linus Torvalds wrote: So I think we could make a more complicated data structure that looks something like this: struct seqlock_retry { unsigned int seq_no; int state; }; and pass that aroun

[PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-09 Thread Waiman Long
When taking the perf profile of the high-systime workload, the amount of spinlock contention contributed by running perf without this patch was about 16%. With this patch, the spinlock contention caused by the running of perf will go away and we will have a more accurate perf profile. Signed-off-

[PATCH v4 0/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-09 Thread Waiman Long
name_lock to prevent live-lock. - Make code re-factoring suggested by George Spelvin. Waiman Long (1): dcache: Translating dentry into pathname without taking rename_lock fs/dcache.c | 196 --- 1 files changed, 133 insertions(+), 63

Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-09 Thread Waiman Long
On 09/09/2013 01:45 PM, Linus Torvalds wrote: On Mon, Sep 9, 2013 at 10:29 AM, Al Viro wrote: I'm not sure I like mixing rcu_read_lock() into that - d_path() and friends can do that themselves just fine (it needs to be taken when seq is even), and e.g. d_walk() doesn't need it at all. Other th

Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-09 Thread Waiman Long
On 09/09/2013 01:29 PM, Al Viro wrote: On Mon, Sep 09, 2013 at 12:18:13PM -0400, Waiman Long wrote: +/** + * read_seqbegin_or_lock - begin a sequence number check or locking block + * lock: sequence lock + * seq : sequence number to be checked + * + * First try it once optimistically without

Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-09 Thread Waiman Long
On 09/09/2013 02:36 PM, Al Viro wrote: On Mon, Sep 09, 2013 at 07:21:11PM +0100, Al Viro wrote: Actually, it's better for prepend_path() as well, because it's actually rcu_read_lock(); seq = read_seqbegin(&rename_lock); again: if (error) goto

Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-09 Thread Waiman Long
On 09/09/2013 03:28 PM, Al Viro wrote: On Mon, Sep 09, 2013 at 08:10:29PM +0100, Al Viro wrote: On Mon, Sep 09, 2013 at 02:46:57PM -0400, Waiman Long wrote: I am fine with your proposed change as long as it gets the job done. I suspect that the real problem is the unlock part of

Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock

2013-09-09 Thread Waiman Long
On 09/09/2013 08:40 PM, George Spelvin wrote: I'm really wondering about only trying once before taking the write lock. Yes, using the lsbit is a cute hack, but are we using it for its cuteness rather than its effectiveness? Renames happen occasionally. If that causes all the current pathname t

Re: kernel BUG at fs/dcache.c:648! with v3.11-7890-ge5c832d

2013-09-10 Thread Waiman Long
On 09/10/2013 04:25 PM, Linus Torvalds wrote: On Tue, Sep 10, 2013 at 12:57 PM, Mace Moneta wrote: The (first) patch looks good; no recurrence. It has only taken 3-5 minutes before, and I've been up for about half an hour now. Ok, good. It's pushed out. Al, your third pile of VFS stuff is als

[PATCH 2/2] dcache: use read_seqlock/unlock() in read_seqbegin_or_lock() & friend

2013-09-11 Thread Waiman Long
code. Signed-off-by: Waiman Long --- fs/dcache.c | 31 --- 1 files changed, 16 insertions(+), 15 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 4d9df3c..8191ca5 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -90,8 +90,8 @@ static struct kmem_cache *dentry_

[PATCH 1/2] seqlock: Add a new blocking reader type

2013-09-11 Thread Waiman Long
. This new blocking reader will not block other non-blocking readers, but will block other blocking readers and writers. Signed-off-by: Waiman Long --- include/linux/seqlock.h | 65 +++--- 1 files changed, 60 insertions(+), 5 deletions(-) diff --git a/inc

Re: [PATCH 1/2] seqlock: Add a new blocking reader type

2013-09-11 Thread Waiman Long
On 09/11/2013 10:55 AM, Al Viro wrote: On Wed, Sep 11, 2013 at 10:28:26AM -0400, Waiman Long wrote: The sequence lock (seqlock) was originally designed for the cases where the readers do not need to block the writers by making the readers retry the read operation when the data change. Since

Re: [PATCH 1/2] seqlock: Add a new blocking reader type

2013-09-11 Thread Waiman Long
On 09/11/2013 01:26 PM, Al Viro wrote: On Wed, Sep 11, 2013 at 12:33:35PM -0400, Waiman Long wrote: Folks, any suggestions on better names? The semantics we are getting is I will welcome any better name suggestion and will incorporate that in the patch. FWIW, the suggestions I've se

Re: [3.12-rc1] Dependency on module-init-tools >= 3.11 ?

2013-09-12 Thread Waiman Long
On 09/12/2013 06:29 AM, Herbert Xu wrote: On Thu, Sep 12, 2013 at 07:20:23PM +0900, Tetsuo Handa wrote: Herbert Xu wrote: The trouble is not all distros will include the softdep modules in the initramfs. So for now I think we will have to live with a fallback. I see. Herbert Xu wrote: OK, c

[PATCH 0/2 v2] dcache: get/release read lock in read_seqbegin_or_lock() & friend

2013-09-12 Thread Waiman Long
Change log -- v1->v2: - Rename the new seqlock primitives to read_seqexcl_lock* and read_seqexcl_unlock*. - Clarify in the commit log and comments about the exclusive nature of the read lock. Waiman Long (2): seqlock: Add a new locking reader type dcache: get/release r

[PATCH 2/2 v2] dcache: get/release read lock in read_seqbegin_or_lock() & friend

2013-09-12 Thread Waiman Long
urate comments in the code. Signed-off-by: Waiman Long --- fs/dcache.c | 31 --- 1 files changed, 16 insertions(+), 15 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 4d9df3c..9e88367 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -90,8 +90,8 @@ static s

[PATCH 1/2 v2] seqlock: Add a new locking reader type

2013-09-12 Thread Waiman Long
he seqlock locking mechanism. This new locking reader will try to take an exclusive lock preventing other writers and locking readers from going forward. However, it won't affect the progress of the other sequence number reading readers as the sequence number won't be changed. Signed
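The behaviour described in this commit message, a locking reader that excludes writers and other locking readers without disturbing sequence-number readers, can be modelled in userspace C11. This is a rough sketch, not the patch itself: all `demo_*` names are hypothetical, trylock variants are used so the model can be exercised from a single thread, and the memory ordering is simplified to the sequentially consistent defaults.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's seqlock_t. */
struct demo_seqlock {
    atomic_uint sequence;   /* changed by writers only */
    atomic_flag lock;       /* shared by writers and locking readers */
};

static void demo_init(struct demo_seqlock *sl)
{
    atomic_init(&sl->sequence, 0);
    atomic_flag_clear(&sl->lock);
}

/* Sequence-number reader: wait for an even count, then remember it. */
static unsigned demo_read_begin(struct demo_seqlock *sl)
{
    unsigned seq;

    do {
        seq = atomic_load(&sl->sequence);
    } while (seq & 1);
    return seq;
}

/* True if a writer ran since demo_read_begin() returned seq. */
static bool demo_read_retry(struct demo_seqlock *sl, unsigned seq)
{
    return atomic_load(&sl->sequence) != seq;
}

/* Locking reader: takes the lock but leaves the sequence untouched. */
static bool demo_read_excl_trylock(struct demo_seqlock *sl)
{
    return !atomic_flag_test_and_set(&sl->lock);
}

static void demo_read_excl_unlock(struct demo_seqlock *sl)
{
    atomic_flag_clear(&sl->lock);
}

/* Writer: takes the same lock and makes the sequence odd. */
static bool demo_write_trylock(struct demo_seqlock *sl)
{
    if (atomic_flag_test_and_set(&sl->lock))
        return false;
    atomic_fetch_add(&sl->sequence, 1);
    return true;
}

/* Writer done: make the sequence even again, then release the lock. */
static void demo_write_unlock(struct demo_seqlock *sl)
{
    atomic_fetch_add(&sl->sequence, 1);
    atomic_flag_clear(&sl->lock);
}
```

While a locking reader holds the lock, `demo_write_trylock()` fails but `demo_read_retry()` still reports no change, because only writers advance the counter.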

Re: [PATCH 0/2 v2] dcache: get/release read lock in read_seqbegin_or_lock() & friend

2013-09-12 Thread Waiman Long
On 09/12/2013 12:38 PM, Linus Torvalds wrote: On Thu, Sep 12, 2013 at 7:55 AM, Waiman Long wrote: Change log -- v1->v2: - Rename the new seqlock primitives to read_seqexcl_lock* and read_seqexcl_unlock*. Applied. Except I peed in the snow and renamed the functions again.T

Re: [PATCH 0/2 v2] dcache: get/release read lock in read_seqbegin_or_lock() & friend

2013-09-12 Thread Waiman Long
On 09/12/2013 01:30 PM, Linus Torvalds wrote: On Thu, Sep 12, 2013 at 9:38 AM, Linus Torvalds wrote: On Thu, Sep 12, 2013 at 7:55 AM, Waiman Long wrote: Change log -- v1->v2: - Rename the new seqlock primitives to read_seqexcl_lock* and read_seqexcl_unlock*. Applied.

[PATCH] perf: Fix potential compilation error with some compilers

2013-10-08 Thread Waiman Long
‘long unsigned int’, but argument 2 has type ‘__u64’ This patch replaces PRIu64 which is "lu" by the explicit "llu" to fix this problem as __u64 is of type "long long unsigned". Signed-off-by: Waiman Long --- .../perf/util/scripting-engines/trace-event-perl.c |
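The format mismatch described here can be illustrated with a small userspace sketch. It assumes, as the message states, that the 64-bit type in question is `long long unsigned`; `demo_u64` and `demo_format_u64` are hypothetical stand-ins and do not come from the perf sources.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for a __u64 that is defined as "long long unsigned". */
typedef unsigned long long demo_u64;

/*
 * Format a 64-bit value with an explicit "%llu". The conversion matches
 * unsigned long long on every platform, whereas a macro that expands to
 * "lu" only matches when the argument type is unsigned long.
 * Returns the number of characters snprintf() would have written.
 */
static int demo_format_u64(char *buf, size_t len, demo_u64 value)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%llu", value);
}
```

When the argument type is not under the caller's control, casting it to `unsigned long long` at the call site is another common way to keep the `"%llu"` conversion and the argument in agreement.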

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-29 Thread Waiman Long
On 08/29/2013 07:42 PM, Linus Torvalds wrote: Waiman? Mind looking at this and testing? Linus Sure, I will try out the patch tomorrow morning and see how it works out for my test case. Regards, Longman

Re: [PATCH RFC v2 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation

2013-08-29 Thread Waiman Long
On 08/29/2013 01:03 PM, Alexander Fyodorov wrote: 29.08.2013, 19:25, "Waiman Long": What I have been thinking is to set a flag in an architecture specific header file to tell if the architecture need a memory barrier. The generic code will then either do a smp_mb() or barrier() depend

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-30 Thread Waiman Long
On 08/29/2013 11:54 PM, Linus Torvalds wrote: On Thu, Aug 29, 2013 at 8:12 PM, Waiman Long wrote: On 08/29/2013 07:42 PM, Linus Torvalds wrote: Waiman? Mind looking at this and testing? Linus Sure, I will try out the patch tomorrow morning and see how it works out for my test case. Ok

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-30 Thread Waiman Long
On 08/30/2013 02:53 PM, Linus Torvalds wrote: So the perf data would be *much* more interesting for a more varied load. I know pretty much exactly what happens with my silly test-program, and as you can see it never really gets to the actual spinlock, because that test program will only ever hi

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-30 Thread Waiman Long
On 08/30/2013 03:40 PM, Al Viro wrote: On Fri, Aug 30, 2013 at 03:20:48PM -0400, Waiman Long wrote: There are more contention in the lglock than I remember for the run in 3.10. This is an area that I need to look at. In fact, lglock is becoming a problem for really large machine with a lot of

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-30 Thread Waiman Long
On 08/30/2013 03:33 PM, Linus Torvalds wrote: On Fri, Aug 30, 2013 at 12:20 PM, Waiman Long wrote: Below is the perf data of my short workloads run in an 80-core DL980: Ok, that doesn't look much like d_lock any more. Sure, there's a small amount of spinlocking going on with loc
