Re: [PATCH V11 04/17] locking/qspinlock: Improve xchg_tail for number of cpus >= 16k

2023-09-10 Thread Waiman Long
On 9/10/23 04:28, guo...@kernel.org wrote: From: Guo Ren The target of xchg_tail is to write the tail to the lock value, so adding prefetchw could help the next cmpxchg step, which may decrease the cmpxchg retry loops of xchg_tail. Some processors may utilize this feature to give a forward gu
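
The retry loop discussed in this thread can be sketched in userspace with C11 atomics. This is only an illustration, not the kernel code: the function name xchg_tail_sketch, the 9-bit LOCKED_PENDING_MASK, and the use of the GCC/Clang __builtin_prefetch() in place of the kernel's prefetchw() are all assumptions made for the example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout, for illustration only: the low 9 bits hold the
 * locked and pending fields, the remaining bits hold the tail. */
#define LOCKED_PENDING_MASK 0x1ffu

/* Hypothetical userspace analogue of xchg_tail(): install a new tail
 * while preserving the locked/pending bits, and return the previous
 * lock word. */
static uint32_t xchg_tail_sketch(_Atomic uint32_t *lock, uint32_t tail)
{
    uint32_t val, new_val;

    /* Hint that this cache line is about to be written (rw = 1),
     * so the following compare-and-swap is less likely to retry. */
    __builtin_prefetch((const void *)lock, 1);

    val = atomic_load_explicit(lock, memory_order_relaxed);
    do {
        new_val = (val & LOCKED_PENDING_MASK) | tail;
        /* On failure, val is refreshed with the current lock word. */
    } while (!atomic_compare_exchange_weak_explicit(lock, &val, new_val,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed));
    return val;
}
```

Whether the prefetch hint actually reduces retries depends on the processor, which is the question the replies in this thread go on to discuss.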

Re: [PATCH V11 07/17] riscv: qspinlock: Introduce qspinlock param for command line

2023-09-11 Thread Waiman Long
On 9/10/23 04:29, guo...@kernel.org wrote: From: Guo Ren Allow cmdline to force the kernel to use queued_spinlock when CONFIG_RISCV_COMBO_SPINLOCKS=y. Signed-off-by: Guo Ren Signed-off-by: Guo Ren --- Documentation/admin-guide/kernel-parameters.txt | 2 ++ arch/riscv/kernel/setup.c

Re: [PATCH V11 04/17] locking/qspinlock: Improve xchg_tail for number of cpus >= 16k

2023-09-11 Thread Waiman Long
On 9/10/23 23:09, Guo Ren wrote: On Mon, Sep 11, 2023 at 10:35 AM Waiman Long wrote: On 9/10/23 04:28, guo...@kernel.org wrote: From: Guo Ren The target of xchg_tail is to write the tail to the lock value, so adding prefetchw could help the next cmpxchg step, which may decrease the cmpxchg

Re: [PATCH V11 04/17] locking/qspinlock: Improve xchg_tail for number of cpus >= 16k

2023-09-13 Thread Waiman Long
On 9/13/23 08:52, Guo Ren wrote: On Wed, Sep 13, 2023 at 4:55 PM Leonardo Bras wrote: On Tue, Sep 12, 2023 at 09:10:08AM +0800, Guo Ren wrote: On Mon, Sep 11, 2023 at 9:03 PM Waiman Long wrote: On 9/10/23 23:09, Guo Ren wrote: On Mon, Sep 11, 2023 at 10:35 AM Waiman Long wrote: On 9/10

Re: [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK

2023-09-13 Thread Waiman Long
On 9/13/23 14:54, Palmer Dabbelt wrote: On Sun, 06 Aug 2023 22:23:34 PDT (-0700), sor...@fastmail.com wrote: On Wed, Aug 2, 2023, at 12:46 PM, guo...@kernel.org wrote: From: Guo Ren According to qspinlock requirements, RISC-V gives out a weak LR/SC forward progress guarantee which does not sa

Re: [PATCH V11 07/17] riscv: qspinlock: Introduce qspinlock param for command line

2023-09-14 Thread Waiman Long
On 9/14/23 03:32, Leonardo Bras wrote: On Tue, Sep 12, 2023 at 09:08:34AM +0800, Guo Ren wrote: On Mon, Sep 11, 2023 at 11:34 PM Waiman Long wrote: On 9/10/23 04:29, guo...@kernel.org wrote: From: Guo Ren Allow cmdline to force the kernel to use queued_spinlock when

Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches

2019-07-23 Thread Waiman Long
On 7/22/19 8:46 AM, peter enderborg wrote: > On 7/2/19 8:37 PM, Waiman Long wrote: >> Currently, a value of '1' is written to /sys/kernel/slab/<cache>/shrink >> file to shrink the slab by flushing all the per-cpu slabs and free >> slabs in partial lists. This applies

[PATCH-next] ipc: Fix race condition in ipc_idr_alloc()

2019-03-10 Thread Waiman Long
concurrent ipc_obtain_object_check() will not incorrectly match a deleted IPC id to a new one. Reported-by: Manfred Spraul Signed-off-by: Waiman Long --- ipc/util.c | 25 ++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/ipc/util.c b/ipc/util.c index 78

[PATCH 2/2] mm, slab: Extend vm/drop_caches to shrink kmem slabs

2019-06-24 Thread Waiman Long
ach memory cgroup. This is to reduce the slab_mutex hold time to minimize impact to other running applications that may need to acquire the mutex. The slab shrinking feature is only available when CONFIG_MEMCG_KMEM is defined as the code need to access slab_root_caches to iterate all the root

[PATCH 0/2] mm, slab: Extend vm/drop_caches to shrink kmem slabs

2019-06-24 Thread Waiman Long
iterate on all the memory cgroups. Waiman Long (2): mm, memcontrol: Add memcg_iterate_all() mm, slab: Extend vm/drop_caches to shrink kmem slabs Documentation/sysctl/vm.txt | 11 -- fs/drop_caches.c| 4 include/linux/memcontrol.h | 3 +++ include/linux/slab.h| 1

[PATCH 1/2] mm, memcontrol: Add memcg_iterate_all()

2019-06-24 Thread Waiman Long
Add a memcg_iterate_all() function for iterating all the available memory cgroups and call the given callback function for each of the memory cgroups. Signed-off-by: Waiman Long --- include/linux/memcontrol.h | 3 +++ mm/memcontrol.c| 13 + 2 files changed, 16

Re: [PATCH 2/2] mm, slab: Extend vm/drop_caches to shrink kmem slabs

2019-06-27 Thread Waiman Long
On 6/26/19 4:19 PM, Roman Gushchin wrote: >> >> +#ifdef CONFIG_MEMCG_KMEM >> +static void kmem_cache_shrink_memcg(struct mem_cgroup *memcg, >> +void __maybe_unused *arg) >> +{ >> +struct kmem_cache *s; >> + >> +if (memcg == root_mem_cgroup) >> +

Re: [PATCH 1/2] mm, memcontrol: Add memcg_iterate_all()

2019-06-27 Thread Waiman Long
On 6/27/19 11:07 AM, Michal Hocko wrote: > On Mon 24-06-19 13:42:18, Waiman Long wrote: >> Add a memcg_iterate_all() function for iterating all the available >> memory cgroups and call the given callback function for each of the >> memory cgroups. > Why is a trivial wrap

Re: [PATCH 2/2] mm, slab: Extend vm/drop_caches to shrink kmem slabs

2019-06-27 Thread Waiman Long
On 6/27/19 11:15 AM, Michal Hocko wrote: > On Mon 24-06-19 13:42:19, Waiman Long wrote: >> With the slub memory allocator, the numbers of active slab objects >> reported in /proc/slabinfo are not real because they include objects >> that are held by the per-cpu slab struct

Re: [PATCH 2/2] mm, slab: Extend vm/drop_caches to shrink kmem slabs

2019-06-27 Thread Waiman Long
On 6/27/19 5:24 PM, Roman Gushchin wrote: >>> 2) what's your long-term vision here? do you think that we need to shrink >>>kmem_caches periodically, depending on memory pressure? how a user >>>will use this new sysctl? >> Shrinking the kmem caches under extreme memory pressure can be one wa

[PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches

2019-07-02 Thread Waiman Long
322796 kB SUnreclaim: 372856 kB # grep task_struct /proc/slabinfo task_struct 2262 2572 7744 4 8 : tunables 0 0 0 : slabdata 643 643 0 Signed-off-by: Waiman Long --- Documentation/ABI/testing/sysfs-kernel-slab | 10 +++-- mm/slab.h

Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches

2019-07-02 Thread Waiman Long
On 7/2/19 2:37 PM, Waiman Long wrote: > Currently, a value of '1' is written to /sys/kernel/slab/<cache>/shrink > file to shrink the slab by flushing all the per-cpu slabs and free > slabs in partial lists. This applies only to the root caches, though. > > Extends this capabi
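
For readers unfamiliar with the interface, the write described above can be sketched as a small C helper. This is a minimal sketch, not code from the patch: the helper name write_shrink_request is hypothetical, and the caller is assumed to pass the path of a cache's shrink file under /sys/kernel/slab (for example the one for task_struct) and to have root privileges.

```c
#include <stdio.h>

/* Hypothetical helper: write "1" to the given shrink file.
 * Returns 0 on success, -1 if the file cannot be opened or written. */
static int write_shrink_request(const char *path)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;

    ok = (fputs("1\n", f) >= 0);
    /* fclose() flushes the buffer, so a write error may only show up here. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

The same effect is normally obtained from a shell by echoing 1 into the file; the helper only makes the sequence explicit.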

Re: [PATCH 2/2] mm, slab: Extend vm/drop_caches to shrink kmem slabs

2019-07-02 Thread Waiman Long
On 6/28/19 3:31 AM, Michal Hocko wrote: > On Thu 27-06-19 17:16:04, Waiman Long wrote: >> On 6/27/19 11:15 AM, Michal Hocko wrote: >>> On Mon 24-06-19 13:42:19, Waiman Long wrote: >>>> With the slub memory allocator, the numbers of active slab objects >>>&g

Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches

2019-07-02 Thread Waiman Long
On 7/2/19 3:09 PM, David Rientjes wrote: > On Tue, 2 Jul 2019, Waiman Long wrote: > >> diff --git a/Documentation/ABI/testing/sysfs-kernel-slab >> b/Documentation/ABI/testing/sysfs-kernel-slab >> index 29601d93a1c2..2a3d0fc4b4ac 100644 >> --- a/Documentation/ABI/test

Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches

2019-07-02 Thread Waiman Long
On 7/2/19 4:03 PM, Andrew Morton wrote: > On Tue, 2 Jul 2019 14:37:30 -0400 Waiman Long wrote: > >> Currently, a value of '1' is written to /sys/kernel/slab/<cache>/shrink >> file to shrink the slab by flushing all the per-cpu slabs and free >> slabs in partial li

Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches

2019-07-03 Thread Waiman Long
On 7/3/19 2:56 AM, Michal Hocko wrote: > On Tue 02-07-19 14:37:30, Waiman Long wrote: >> Currently, a value of '1' is written to /sys/kernel/slab/<cache>/shrink >> file to shrink the slab by flushing all the per-cpu slabs and free >> slabs in partial lists. This applies

Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches

2019-07-03 Thread Waiman Long
On 7/3/19 10:37 AM, Michal Hocko wrote: > On Wed 03-07-19 09:12:13, Waiman Long wrote: >> On 7/3/19 2:56 AM, Michal Hocko wrote: >>> On Tue 02-07-19 14:37:30, Waiman Long wrote: >>>> Currently, a value of '1' is written to /sys/kernel/slab/<cache>/shrink >>

Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches

2019-07-03 Thread Waiman Long
On 7/2/19 5:33 PM, Andrew Morton wrote: > On Tue, 2 Jul 2019 16:44:24 -0400 Waiman Long wrote: > >> On 7/2/19 4:03 PM, Andrew Morton wrote: >>> On Tue, 2 Jul 2019 14:37:30 -0400 Waiman Long wrote: >>> >>>> Currently, a value of '1" is written

Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches

2019-07-03 Thread Waiman Long
On 7/3/19 12:10 PM, Christopher Lameter wrote: > On Wed, 3 Jul 2019, Waiman Long wrote: > >> On 7/3/19 2:56 AM, Michal Hocko wrote: >>> On Tue 02-07-19 14:37:30, Waiman Long wrote: >>>> Currently, a value of '1' is written to /sys/kernel/slab/<cache>/shrink &

Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches

2019-07-03 Thread Waiman Long
On 7/3/19 11:53 AM, Michal Hocko wrote: > On Wed 03-07-19 11:21:16, Waiman Long wrote: >> On 7/2/19 5:33 PM, Andrew Morton wrote: >>> On Tue, 2 Jul 2019 16:44:24 -0400 Waiman Long wrote: >>> >>>> On 7/2/19 4:03 PM, Andrew Morton wrote: >>>>>

[PATCH-tip v5 20/21] sched, TP-futex: Make wake_up_q() return wakeup count

2017-02-03 Thread Waiman Long
Unlike wake_up_process(), wake_up_q() doesn't tell us how many tasks have been woken up. This information can sometimes be useful for tracking purpose. So wake_up_q() is now modified to return that information. Signed-off-by: Waiman Long --- include/linux/sched/wake_q.h | 2 +- kernel/fu

[PATCH-tip v5 09/21] futex: Introduce throughput-optimized (TP) futexes

2017-02-03 Thread Waiman Long
% 40 15,383,536 27,687,160 +80% 50 13,290,368 23,096,937 +74% 100 8,577,763 14,410,909 +68% 1us sleep 176,450 179,660 +2% Signed-off-by: Waiman Long --- include/uapi/linux/futex.h | 4 + kernel

[PATCH-tip v5 10/21] TP-futex: Enable robust handling

2017-02-03 Thread Waiman Long
ned-off-by: Waiman Long --- kernel/futex.c | 85 +- 1 file changed, 79 insertions(+), 6 deletions(-) diff --git a/kernel/futex.c b/kernel/futex.c index 6a59e6d..46a1a4b 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -1006,7 +1006,7 @@ sta

[PATCH-tip v5 19/21] perf bench: Extend mutex/rwlock futex suite to test TP futexes

2017-02-03 Thread Waiman Long
This patch extends the futex-mutex and futex-rwlock microbenchmarks to test userspace mutexes and rwlocks built on top of the TP futexes. We can then compare the relative performance of those userspace locks based on different type of futexes. Signed-off-by: Waiman Long --- tools/perf/bench

[PATCH-tip v5 21/21] futex: Dump internal futex state via debugfs

2017-02-03 Thread Waiman Long
For debugging purpose, it is sometimes useful to dump the internal states in the futex hash bucket table. This patch adds a file "futex_hash_table" in debugfs root filesystem to dump the internal futex states. Signed-off-by: Waiman Long --- kernel/fu

[PATCH-tip v5 18/21] TP-futex, doc: Update TP futexes document on shared locking

2017-02-03 Thread Waiman Long
The tp-futex.txt was updated to add description about shared locking support. Signed-off-by: Waiman Long --- Documentation/tp-futex.txt | 163 +++-- 1 file changed, 143 insertions(+), 20 deletions(-) diff --git a/Documentation/tp-futex.txt b

[PATCH-tip v5 15/21] TP-futex: Support userspace reader/writer locks

2017-02-03 Thread Waiman Long
write lock/unlock code. This is by design to minimize any additional overhead for mutex lock and unlock. As a result, the TP futex rwlock prefers writers a bit more than readers. Signed-off-by: Waiman Long --- include/uapi/linux/futex.h | 28 +- kernel/fu

[PATCH-tip v5 16/21] TP-futex: Enable kernel reader lock stealing

2017-02-03 Thread Waiman Long
mechanism hasn't been enabled yet. A new field locksteal_disabled is added to the futex state object for controlling reader lock stealing. So the waiting reader must retrieve the futex state object first before doing it. Signed-off-by: Waiman Long --- kernel/futex.c

[PATCH-tip v5 17/21] TP-futex: Group readers together in wait queue

2017-02-03 Thread Waiman Long
iming. Lock starvation should not happen on the TP futexes as long as the underlying kernel mutex is lock starvation free which is the case for 4.10 and later kernel. Signed-off-by: Waiman Long --- kernel/futex.c | 136 +++-- 1 file changed

[PATCH-tip v5 13/21] TP-futex: Add timeout support

2017-02-03 Thread Waiman Long
waiting in the serialization mutex. Signed-off-by: Waiman Long --- kernel/futex.c | 50 ++ 1 file changed, 42 insertions(+), 8 deletions(-) diff --git a/kernel/futex.c b/kernel/futex.c index 22f7906..91b2e02 100644 --- a/kernel/futex.c +++ b/kern

[PATCH-tip v5 07/21] futex: Add a new futex type field into futex_state

2017-02-03 Thread Waiman Long
As the futex_state structure will be overloaded in later patches to be used by non-PI futexes, it is necessary to add a type field to distinguish among different types of futexes. Signed-off-by: Waiman Long --- kernel/futex.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions

[PATCH-tip v5 12/21] TP-futex: Return status code on FUTEX_LOCK calls

2017-02-03 Thread Waiman Long
lock is acquired. 2) Bits 08-15: reserved 3) Bits 16-30: how many times the task sleeps in the optimistic spinning loop. By returning the TP status code, an external monitoring or tracking program can have a macro view of how the TP futexes are performing. Signed-off-by: Waiman Long --- kernel
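
A monitoring program could unpack such a status word roughly as follows. This is a sketch under stated assumptions: only the bit positions come from the excerpt above, the macro and function names are hypothetical, the meaning of the low byte is cut off in the excerpt and is left uninterpreted, and the status is assumed to be a non-negative return value (negative values being error codes).

```c
#include <stdint.h>

/* Bit positions as listed in the excerpt; names are hypothetical. */
#define TP_LOW_MASK     0xffu    /* bits 0-7 (meaning truncated in the excerpt) */
#define TP_SLEEP_SHIFT  16
#define TP_SLEEP_MASK   0x7fffu  /* 15 bits wide: bits 16-30 */

/* Number of times the task slept in the optimistic spinning loop. */
static unsigned int tp_sleep_count(int32_t status)
{
    return ((uint32_t)status >> TP_SLEEP_SHIFT) & TP_SLEEP_MASK;
}

/* Raw value of bits 0-7, returned without interpretation. */
static unsigned int tp_low_byte(int32_t status)
{
    return (uint32_t)status & TP_LOW_MASK;
}
```

Bits 8-15 are reserved, so both helpers mask them out rather than assuming they are zero.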

[PATCH-tip v5 08/21] futex: Allow direct attachment of futex_state objects to hash bucket

2017-02-03 Thread Waiman Long
nk the futex state objects as well as a new spinlock to manage them are added to the hash bucket. To limit size increase for UP systems, these new fields are only for SMP machines where the cacheline alignment of the hash bucket leaves it with enough empty space for the new fields. Signed-off-by: Waima

[PATCH-tip v5 14/21] TP-futex, doc: Add TP futexes documentation

2017-02-03 Thread Waiman Long
This patch adds a new document file on how to use the TP futexes. Signed-off-by: Waiman Long --- Documentation/00-INDEX | 2 + Documentation/tp-futex.txt | 161 + 2 files changed, 163 insertions(+) create mode 100644 Documentation/tp-futex.txt

[PATCH-tip v5 02/21] perf bench: New microbenchmark for userspace rwlock performance

2017-02-03 Thread Waiman Long
= 0.2% Per-thread Locking Rates: Avg = 75,392 ops/sec (+- 0.30%) Min = 71,259 ops/sec Max = 78,211 ops/sec Signed-off-by: Waiman Long --- tools/perf/Documentation/perf-bench.txt | 3 + tools/perf/bench/bench.h| 1 + tools/perf/bench/futex-locks.c

[PATCH-tip v5 06/21] futex: Consolidate pure pi_state_list add & delete codes to helpers

2017-02-03 Thread Waiman Long
Two new helper functions (task_pi_list_add & task_pi_list_del) are created to consolidate all the pure pi_state_list addition and insertion codes. The set_owner argument in task_pi_list_add() will be needed in a later patch. Signed-off-by: Waiman Long --- kernel/futex.c

[PATCH-tip v5 11/21] TP-futex: Implement lock handoff to prevent lock starvation

2017-02-03 Thread Waiman Long
: Waiman Long --- kernel/futex.c | 59 -- 1 file changed, 53 insertions(+), 6 deletions(-) diff --git a/kernel/futex.c b/kernel/futex.c index 46a1a4b..348b44c 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -63,6 +63,7 @@ #include

[PATCH-tip v5 03/21] futex: Consolidate duplicated timer setup code

2017-02-03 Thread Waiman Long
A new futex_setup_timer() helper function is added to consolidate all the hrtimer_sleeper setup code. Signed-off-by: Waiman Long --- kernel/futex.c | 67 -- 1 file changed, 37 insertions(+), 30 deletions(-) diff --git a/kernel/futex.c b

[PATCH-tip v5 05/21] futex: Add helpers to get & cmpxchg futex value without lock

2017-02-03 Thread Waiman Long
Two new helper functions cmpxchg_futex_value() and get_futex_value() are added to access and change the futex value without the hash bucket lock. As a result, page fault is enabled and the page will be faulted in if not present yet. Signed-off-by: Waiman Long --- kernel/futex.c | 15

[PATCH-tip v5 04/21] futex: Rename futex_pi_state to futex_state

2017-02-03 Thread Waiman Long
states are also renamed. Signed-off-by: Waiman Long --- include/linux/sched.h | 4 +- kernel/futex.c| 107 +- 2 files changed, 56 insertions(+), 55 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index e9d5503

[PATCH-tip v5 00/21] futex: Introducing throughput-optimized (TP) futexes

2017-02-03 Thread Waiman Long
the performance characteristics of the TP futexes when implementing userspace mutex and rwlock respectively when compared with other possible way of doing so via the wait-wake futexes. Waiman Long (21): perf bench: New microbenchmark for userspace mutex performance perf bench: New microbenchmark

[PATCH-tip v5 01/21] perf bench: New microbenchmark for userspace mutex performance

2017-02-03 Thread Waiman Long
% Exclusive unlock futex calls = 8.9% EAGAIN lock errors = 99.4% Process wakeups = 0.8% Per-thread Locking Rates: Avg = 126,202 ops/sec (+- 0.21%) Min = 120,114 ops/sec Max = 131,375 ops/sec Signed-off-by: Waiman Long --- tools/perf/Documentation/perf-bench.txt | 2

Re: [PATCH-tip v5 17/21] TP-futex: Group readers together in wait queue

2017-02-03 Thread Waiman Long
On 02/03/2017 01:23 PM, valdis.kletni...@vt.edu wrote: > On Fri, 03 Feb 2017 13:03:50 -0500, Waiman Long said: > >> On a 2-socket 36-core E5-2699 v3 system (HT off) running on a 4.10 >> WW futex TP f

[PATCH-tip v6 01/22] perf bench: New microbenchmark for userspace mutex performance

2017-03-22 Thread Waiman Long
slowpaths = 23.9% Exclusive unlock slowpaths = 32.6% EAGAIN lock errors = 93.7% Process wakeups = 13.2% Per-thread Locking Rates: Avg = 112,057 ops/sec (+- 0.56%) Min = 105,377 ops/sec Max = 119,632 ops/sec Signed-off-by: Waiman Long --- tools/perf

[PATCH-tip v6 06/22] futex: Consolidate pure pi_state_list add & delete codes to helpers

2017-03-22 Thread Waiman Long
Two new helper functions (task_pi_list_add & task_pi_list_del) are created to consolidate all the pure pi_state_list addition and insertion codes. The set_owner argument in task_pi_list_add() will be needed in a later patch. Signed-off-by: Waiman Long --- kernel/futex.c

[PATCH-tip v6 00/22] futex: Introducing throughput-optimized (TP) futexes

2017-03-22 Thread Waiman Long
ion about the performance characteristics of the TP futexes when implementing userspace mutex and rwlock respectively when compared with other possible ways of doing so via the wait-wake futexes. Waiman Long (22): perf bench: New microbenchmark for userspace mutex performance perf bench: New mic

[PATCH-tip v6 18/22] TP-futex: Group readers together in wait queue

2017-03-22 Thread Waiman Long
disbanded when the group leader goes to sleep. In this case, all those readers will go into the mutex wait queue alone and wait for their turn to acquire the TP futex. Signed-off-by: Waiman Long --- kernel/futex.c | 143 +++-- 1 file changed, 140

[PATCH-tip v6 19/22] TP-futex, doc: Update TP futexes document on shared locking

2017-03-22 Thread Waiman Long
500,372 2,160,265 - Read unlock slowpaths 1,210,698 1 - In this case, the TP futex is more than 2X the performance of the WW futex. Signed-off-by: Waiman Long --- Documentation/tp-futex.txt | 170 +++-- 1 file

[PATCH-tip v6 21/22] sched, TP-futex: Make wake_up_q() return wakeup count

2017-03-22 Thread Waiman Long
Unlike wake_up_process(), wake_up_q() doesn't tell us how many tasks have been woken up. This information can sometimes be useful for tracking purpose. So wake_up_q() is now modified to return that information. Signed-off-by: Waiman Long --- include/linux/sched/wake_q.h | 2 +- kernel/fu

[PATCH-tip v6 12/22] TP-futex: Return status code on FUTEX_LOCK calls

2017-03-22 Thread Waiman Long
lock is acquired. 2) Bits 08-15: reserved 3) Bits 16-30: how many times the task sleeps in the optimistic spinning loop. By returning the TP status code, an external monitoring or tracking program can have a macro view of how the TP futexes are performing. Signed-off-by: Waiman Long --- kernel

[PATCH-tip v6 22/22] futex: Dump internal futex state via debugfs

2017-03-22 Thread Waiman Long
For debugging purpose, it is sometimes useful to dump the internal states in the futex hash bucket table. This patch adds a file "futex_hash_table" in debugfs root filesystem to dump the internal futex states. Signed-off-by: Waiman Long --- kernel/fu

[PATCH-tip v6 13/22] TP-futex: Add timeout support

2017-03-22 Thread Waiman Long
types due to the fact that timer expiration can't be detected when the thread is waiting in the serialization mutex. Signed-off-by: Waiman Long --- kernel/futex.c | 52 +++- 1 file changed, 43 insertions(+), 9 deletions(-) diff --git a/kernel

[PATCH-tip v6 16/22] TP-futex: Support userspace reader/writer locks

2017-03-22 Thread Waiman Long
write lock/unlock code. This is by design to minimize any additional overhead for mutex lock and unlock code. Signed-off-by: Waiman Long --- include/uapi/linux/futex.h | 28 +- kernel/futex.c | 228 ++--- 2 files changed, 221 insertions(+)

[PATCH-tip v6 11/22] TP-futex: Implement lock handoff to prevent lock starvation

2017-03-22 Thread Waiman Long
: Waiman Long --- kernel/futex.c | 59 -- 1 file changed, 53 insertions(+), 6 deletions(-) diff --git a/kernel/futex.c b/kernel/futex.c index af367e8..b71c411 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -63,6 +63,7 @@ #include

[PATCH-tip v6 20/22] perf bench: Extend mutex/rwlock futex suite to test TP futexes

2017-03-22 Thread Waiman Long
This patch extends the futex-mutex and futex-rwlock microbenchmarks to test userspace mutexes and rwlocks built on top of the TP futexes. We can then compare the relative performance of those userspace locks based on different type of futexes. Signed-off-by: Waiman Long --- tools/perf/bench

[PATCH-tip v6 17/22] TP-futex: Enable kernel reader lock stealing

2017-03-22 Thread Waiman Long
set, it will enable kernel reader to steal the lock when the futex is currently reader-owned and the lock handoff mechanism hasn't been enabled yet. Signed-off-by: Waiman Long --- kernel/futex.c | 20 ++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/kernel/fu

[PATCH-tip v6 10/22] TP-futex: Enable robust handling

2017-03-22 Thread Waiman Long
ned-off-by: Waiman Long --- kernel/futex.c | 85 +- 1 file changed, 79 insertions(+), 6 deletions(-) diff --git a/kernel/futex.c b/kernel/futex.c index 7270552..af367e8 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -1006,7 +1006,7 @@ sta

[PATCH-tip v6 09/22] futex: Introduce throughput-optimized (TP) futexes

2017-03-22 Thread Waiman Long
which can be used for other locking primitives like conditional variables, or semaphores. So it is not a direct replacement of wait-wake futexes. Signed-off-by: Waiman Long --- include/uapi/linux/futex.h | 4 + kernel/futex.c | 569 - 2

[PATCH-tip v6 07/22] futex: Add a new futex type field into futex_state

2017-03-22 Thread Waiman Long
As the futex_state structure will be overloaded in later patches to be used by non-PI futexes, it is necessary to add a type field to distinguish among different types of futexes. Signed-off-by: Waiman Long --- kernel/futex.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions

[PATCH-tip v6 14/22] TP-futex: Optionally return EAGAIN for userspace locking

2017-03-22 Thread Waiman Long
case. Doing locking in the userspace can lead to lock starvation in some cases unless some precautionary measure is taken. So it is recommended that kernel locking should be performed after a number of failures in userspace locking. Signed-off-by: Waiman Long --- kernel/futex.c | 53

[PATCH-tip v6 15/22] TP-futex, doc: Add TP futexes documentation

2017-03-22 Thread Waiman Long
1,758,403 13 - Signed-off-by: Waiman Long --- Documentation/00-INDEX | 2 + Documentation/tp-futex.txt | 180 + 2 files changed, 182 insertions(+) create mode 100644 Documentation/tp-futex.txt diff --git a/Documentation/00-INDEX

[PATCH-tip v6 05/22] futex: Add helpers to get & cmpxchg futex value without lock

2017-03-22 Thread Waiman Long
Two new helper functions cmpxchg_futex_value() and get_futex_value() are added to access and change the futex value without the hash bucket lock. As a result, page fault is enabled and the page will be faulted in if not present yet. Signed-off-by: Waiman Long --- kernel/futex.c | 15

[PATCH-tip v6 08/22] futex: Allow direct attachment of futex_state objects to hash bucket

2017-03-22 Thread Waiman Long
nk the futex state objects as well as a new spinlock to manage them are added to the hash bucket. To limit size increase for UP systems, these new fields are only for SMP machines where the cacheline alignment of the hash bucket leaves it with enough empty space for the new fields. Signed-off-by: Waima

[PATCH-tip v6 03/22] futex: Consolidate duplicated timer setup code

2017-03-22 Thread Waiman Long
A new futex_setup_timer() helper function is added to consolidate all the hrtimer_sleeper setup code. Signed-off-by: Waiman Long --- kernel/futex.c | 67 -- 1 file changed, 37 insertions(+), 30 deletions(-) diff --git a/kernel/futex.c b

[PATCH-tip v6 04/22] futex: Rename futex_pi_state to futex_state

2017-03-22 Thread Waiman Long
states are also renamed. Signed-off-by: Waiman Long --- include/linux/sched.h | 4 +- kernel/futex.c| 107 +- 2 files changed, 56 insertions(+), 55 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index d67eee8

[PATCH-tip v6 02/22] perf bench: New microbenchmark for userspace rwlock performance

2017-03-22 Thread Waiman Long
wakeups = 1.6% Shared Lock Batch Stats: Total shared lock batches= 10,569,959 Avg batch size = 1.6 Max batch size = 26 Per-thread Locking Rates: Avg = 93,841 ops/sec (+- 0.40%) Min = 89,269 ops/sec Max = 101,021 ops/sec Signed-off-by: Waiman

[RFC PATCH 00/14] cgroup: Implement cgroup v2 thread mode & CPU controller

2017-04-21 Thread Waiman Long
isc preps for cgroup unified hierarchy interface sched: Implement interface for cgroup unified hierarchy Waiman Long (7): cgroup: Fix reference counting bug in cgroup_procs_write() cgroup: Move debug cgroup to its own file cgroup: Keep accurate count of tasks in each css_set cgroup: Make

[RFC PATCH 12/14] sched: Implement interface for cgroup unified hierarchy

2017-04-21 Thread Waiman Long
From: Tejun Heo While the cpu controller doesn't have any functional problems, there are a couple interface issues which can be addressed in the v2 interface. * cpuacct being a separate controller. This separation is artificial and rather pointless as demonstrated by most use cases co-mountin

[RFC PATCH 13/14] sched: Make cpu/cpuacct threaded controllers

2017-04-21 Thread Waiman Long
Make cpu and cpuacct cgroup controllers usable within a threaded cgroup. Signed-off-by: Waiman Long --- kernel/sched/core.c| 1 + kernel/sched/cpuacct.c | 1 + 2 files changed, 2 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 78dfcaa..9d8beda 100644 --- a/kernel

[RFC PATCH 10/14] cgroup: Implement new thread mode semantics

2017-04-21 Thread Waiman Long
e the competition between internal processes and child cgroups at the thread root. This model will be flexible enough to support the need of the threaded controllers. Signed-off-by: Waiman Long --- Documentation/cgroup-v2.txt | 51 +++ kernel/cgroup/cgroup-internal.h | 10 +++ kernel/c

[RFC PATCH 14/14] cgroup: Enable separate control knobs for thread root internal processes

2017-04-21 Thread Waiman Long
the memcg to showcase its effect. Signed-off-by: Waiman Long --- Documentation/cgroup-v2.txt | 20 include/linux/cgroup-defs.h | 15 ++ kernel/cgroup/cgroup.c | 122 +++- kernel/cgroup/debug.c | 6 +++ mm/memcontrol.c |

[RFC PATCH 11/14] sched: Misc preps for cgroup unified hierarchy interface

2017-04-21 Thread Waiman Long
From: Tejun Heo Make the following changes in preparation for the cpu controller interface implementation for the unified hierarchy. This patch doesn't cause any functional differences. * s/cpu_stats_show()/cpu_cfs_stats_show()/ * s/cpu_files/cpu_legacy_files/ * Separate out cpuacct_stats_rea

[RFC PATCH 08/14] cgroup: Keep accurate count of tasks in each css_set

2017-04-21 Thread Waiman Long
count. This new variable is protected by the css_set_lock. Functions that require the actual task count are updated to use the new variable. Signed-off-by: Waiman Long --- include/linux/cgroup-defs.h | 3 +++ kernel/cgroup/cgroup-v1.c | 6 +- kernel/cgroup/cgroup.c | 5 + kernel

[RFC PATCH 09/14] cgroup: Make debug cgroup support v2 and thread mode

2017-04-21 Thread Waiman Long
_read() function now prints out the addresses of the css'es associated with the current css_set. 7) A new cgroup_subsys_states file is added to display the css objects associated with a cgroup. Signed-off-by: Waiman Long --- kernel/cgroup/deb

[RFC PATCH 06/14] cgroup: Fix reference counting bug in cgroup_procs_write()

2017-04-21 Thread Waiman Long
counting error. Signed-off-by: Waiman Long --- kernel/cgroup/cgroup-internal.h | 2 +- kernel/cgroup/cgroup-v1.c | 2 +- kernel/cgroup/cgroup.c | 10 ++ 3 files changed, 8 insertions(+), 6 deletions(-) diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup

[RFC PATCH 07/14] cgroup: Move debug cgroup to its own file

2017-04-21 Thread Waiman Long
debug_taskcount_read() function. Signed-off-by: Waiman Long --- kernel/cgroup/Makefile| 1 + kernel/cgroup/cgroup-v1.c | 147 - kernel/cgroup/debug.c | 165 ++ 3 files changed, 166 insertions(+), 147

[RFC PATCH 04/14] cgroup: implement CSS_TASK_ITER_THREADED

2017-04-21 Thread Waiman Long
From: Tejun Heo cgroup v2 is in the process of growing thread granularity support. Once thread mode is enabled, the root cgroup of the subtree serves as the proc_cgrp to which the processes of the subtree conceptually belong and domain-level resource consumptions not tied to any specific task are

[RFC PATCH 05/14] cgroup: implement cgroup v2 thread support

2017-04-21 Thread Waiman Long
From: Tejun Heo This patch implements cgroup v2 thread support. The goal of the thread mode is supporting hierarchical accounting and control at thread granularity while staying inside the resource domain model which allows coordination across different resource controllers and handling of anony

[RFC PATCH 02/14] cgroup: add @flags to css_task_iter_start() and implement CSS_TASK_ITER_PROCS

2017-04-21 Thread Waiman Long
From: Tejun Heo css_task_iter currently always walks all tasks. With the scheduled cgroup v2 thread support, the iterator would need to handle multiple types of iteration. As a preparation, add @flags to css_task_iter_start() and implement CSS_TASK_ITER_PROCS. If the flag is not specified, it

[RFC PATCH 03/14] cgroup: introduce cgroup->proc_cgrp and threaded css_set handling

2017-04-21 Thread Waiman Long
From: Tejun Heo cgroup v2 is in the process of growing thread granularity support. Once thread mode is enabled, the root cgroup of the subtree serves as the proc_cgrp to which the processes of the subtree conceptually belong and domain-level resource consumptions not tied to any specific task are

[RFC PATCH 01/14] cgroup: reorganize cgroup.procs / task write path

2017-04-21 Thread Waiman Long
From: Tejun Heo Currently, writes "cgroup.procs" and "cgroup.tasks" files are all handled by __cgroup_procs_write() on both v1 and v2. This patch reorganizes the write path so that there are common helper functions that different write paths use. While this somewhat increases LOC, the different

Re: [RFC PATCH 00/14] cgroup: Implement cgroup v2 thread mode & CPU controller

2017-04-26 Thread Waiman Long
On 04/21/2017 10:03 AM, Waiman Long wrote: > This patchset incorporates the following 2 patchsets from Tejun Heo: > > 1) cgroup v2 thread mode patchset (5 patches) > https://lkml.org/lkml/2017/2/2/592 > 2) CPU Controller on Control Group v2 (2 patches) > https://lkml

[RFC PATCH v2 00/17] cgroup: Major changes to cgroup v2 core

2017-05-15 Thread Waiman Long
t sched: Misc preps for cgroup unified hierarchy interface sched: Implement interface for cgroup unified hierarchy Waiman Long (10): cgroup: Fix reference counting bug in cgroup_procs_write() cgroup: Prevent kill_css() from being called more than once cgroup: Move debug cgroup to its own file

[RFC PATCH v2 16/17] sched: Implement interface for cgroup unified hierarchy

2017-05-15 Thread Waiman Long
From: Tejun Heo While the cpu controller doesn't have any functional problems, there are a couple interface issues which can be addressed in the v2 interface. * cpuacct being a separate controller. This separation is artificial and rather pointless as demonstrated by most use cases co-mountin

[RFC PATCH v2 17/17] sched: Make cpu/cpuacct threaded controllers

2017-05-15 Thread Waiman Long
Make cpu and cpuacct cgroup controllers usable within a threaded cgroup. Signed-off-by: Waiman Long --- kernel/sched/core.c| 1 + kernel/sched/cpuacct.c | 1 + 2 files changed, 2 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index b041081..479f69e 100644 --- a/kernel

[RFC PATCH v2 12/17] cgroup: Remove cgroup v2 no internal process constraint

2017-05-15 Thread Waiman Long
ribution between internal processes as a group and other child cgroups. Signed-off-by: Waiman Long --- Documentation/cgroup-v2.txt | 76 ++- include/linux/cgroup-defs.h | 15 +++ kernel/cgroup/cgroup-internal.h | 1 - kernel/cgroup/cgroup-v1.c | 3 - ke

[RFC PATCH v2 15/17] sched: Misc preps for cgroup unified hierarchy interface

2017-05-15 Thread Waiman Long
From: Tejun Heo Make the following changes in preparation for the cpu controller interface implementation for the unified hierarchy. This patch doesn't cause any functional differences. * s/cpu_stats_show()/cpu_cfs_stats_show()/ * s/cpu_files/cpu_legacy_files/ * Separate out cpuacct_stats_rea

[RFC PATCH v2 11/17] cgroup: Implement new thread mode semantics

2017-05-15 Thread Waiman Long
e the competition between internal processes and child cgroups at the thread root. This model will be flexible enough to support the need of the threaded controllers. Signed-off-by: Waiman Long --- Documentation/cgroup-v2.txt | 51 +++ kernel/cgroup/cgroup-internal.h | 10 +++ kernel/c

[RFC PATCH v2 07/17] cgroup: Prevent kill_css() from being called more than once

2017-05-15 Thread Waiman Long
The kill_css() function may be called more than once under the condition that the css was killed but not physically removed yet followed by the removal of the cgroup that is hosting the css. This patch prevents any harm from being done when that happens. Signed-off-by: Waiman Long --- include

[RFC PATCH v2 14/17] cgroup: Enable printing of v2 controllers' cgroup hierarchy

2017-05-15 Thread Waiman Long
This patch adds a new debug control file on the cgroup v2 root directory to print out the cgroup hierarchy for each of the v2 controllers. Signed-off-by: Waiman Long --- kernel/cgroup/debug.c | 141 ++ 1 file changed, 141 insertions(+) diff --git

[RFC PATCH v2 13/17] cgroup: Allow fine-grained controllers control in cgroup v2

2017-05-15 Thread Waiman Long
hy that can be quite different from other controllers. We now have the freedom and flexibility to create the right hierarchy for each controller to suit their own needs without performance loss when compared with cgroup v1. Signed-off-by: Waiman Long --- Documentation

[RFC PATCH v2 10/17] cgroup: Make debug cgroup support v2 and thread mode

2017-05-15 Thread Waiman Long
p. Signed-off-by: Waiman Long --- kernel/cgroup/debug.c | 196 +- 1 file changed, 179 insertions(+), 17 deletions(-) diff --git a/kernel/cgroup/debug.c b/kernel/cgroup/debug.c index ada53e6..3121811 100644 --- a/kernel/cgroup/debug.c +++ b/ker

[RFC PATCH v2 08/17] cgroup: Move debug cgroup to its own file

2017-05-15 Thread Waiman Long
debug_taskcount_read() function. Signed-off-by: Waiman Long --- kernel/cgroup/Makefile| 1 + kernel/cgroup/cgroup-v1.c | 147 - kernel/cgroup/debug.c | 165 ++ 3 files changed, 166 insertions(+), 147

[RFC PATCH v2 06/17] cgroup: Fix reference counting bug in cgroup_procs_write()

2017-05-15 Thread Waiman Long
counting error. Signed-off-by: Waiman Long --- kernel/cgroup/cgroup-internal.h | 2 +- kernel/cgroup/cgroup-v1.c | 2 +- kernel/cgroup/cgroup.c | 10 ++ 3 files changed, 8 insertions(+), 6 deletions(-) diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup
