Re: [PATCH V2 Resend 3/4] workqueue: Schedule work on non-idle cpu instead of current one

2012-11-26 Thread Tejun Heo
Hello, Viresh. On Tue, Nov 06, 2012 at 04:08:45PM +0530, Viresh Kumar wrote: > Workqueues queues work on current cpu, if the caller haven't passed a > preferred > cpu. This may wake up an idle CPU, which is actually not required. > > This work can be processed by any CPU and so we must select a

Why is cpuset_cpus_allowed_fallback() necessary?

2012-11-26 Thread Tejun Heo
Hello, guys. I'm wondering why cpuset_cpus_allowed_fallback() is necessary. This is called from, e.g., try_to_wake_up()->select_task_rq() when none of the cpus in ->cpus_allowed is useable. The cpuset callback invokes do_set_cpus_allowed() w/ the cpuset's cpus_allowed. This was added by the fol

Re: [patch] workqueue: exit rescuer_thread() as TASK_RUNNING

2012-11-28 Thread Tejun Heo
On Wed, Nov 28, 2012 at 07:17:18AM +0100, Mike Galbraith wrote: > > A rescue thread exiting TASK_INTERRUPTIBLE can lead to a task scheduling > off, never to be seen again. In the case where this occurred, an exiting > thread hit reiserfs homebrew conditional resched while holding a mutex, > bring

Re: [PATCH] cgroup: avoid creating degenerate allcg_list

2012-11-28 Thread Tejun Heo
Hey, Greg. On Wed, Nov 28, 2012 at 10:26:32AM -0800, Greg Thelen wrote: > Before this patch init_cgroup_root() created a degenerate list by > first inserting a element into allcg_list and then initializing the > inserted list element. The initialization reset the element's > prev/next fields form

Re: [PATCH] cgroup: fix lockdep warning for event_control

2012-11-28 Thread Tejun Heo
Hello, Greg. On Wed, Nov 28, 2012 at 12:15:42PM -0800, Greg Thelen wrote: > @@ -4276,6 +4276,7 @@ static int cgroup_destroy_locked(struct cgroup *cgrp) > DEFINE_WAIT(wait); > struct cgroup_event *event, *tmp; > struct cgroup_subsys *ss; > + struct list_head tmp_list; LIST_HE

[PATCH 01/13] cpuset: remove unused cpuset_unlock()

2012-11-28 Thread Tejun Heo
Signed-off-by: Tejun Heo --- kernel/cpuset.c | 11 --- 1 file changed, 11 deletions(-) diff --git a/kernel/cpuset.c b/kernel/cpuset.c index b017887..a423774 100644 --- a/kernel/cpuset.c +++ b/kernel/cpuset.c @@ -2412,17 +2412,6 @@ int __cpuset_node_allowed_hardwall(int node, gfp_t

[PATCH 04/13] cpuset: introduce CS_ONLINE

2012-11-28 Thread Tejun Heo
Add CS_ONLINE which is set from css_online() and cleared from css_offline(). This will enable using generic cgroup iterator while allowing decoupling cpuset from cgroup internal locking. Signed-off-by: Tejun Heo --- kernel/cpuset.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion

[PATCH 05/13] cpuset: introduce cpuset_for_each_child()

2012-11-28 Thread Tejun Heo
n't visible to all the iterations and this patch currently doesn't make any functional difference. This will be used to de-couple cpuset locking from cgroup core. Signed-off-by: Tejun Heo --- kernel/cpuset.c | 85 - 1 fil

[PATCH 09/13] cpuset: don't nest cgroup_mutex inside get_online_cpus()

2012-11-28 Thread Tejun Heo
own+0x36/0x50 [] store_online+0x5d/0xe0 [] dev_attr_store+0x18/0x30 [] sysfs_write_file+0xe0/0x150 [] vfs_write+0xa8/0x160 [] sys_write+0x52/0xa0 [] system_call_fastpath+0x16/0x1b Signed-off-by: Tejun Heo --- kernel/cpuset.c | 28 1 file changed, 24 insertions(+)

[PATCH 11/13] cpuset: pin down cpus and mems while a task is being attached

2012-11-28 Thread Tejun Heo
e() treats cpusets w/ non-zero ->attach_in_progress like cpusets w/ tasks and refuses to remove all cpus or mems from it. This currently doesn't make any functional difference as everything is protected by cgroup_mutex but enables decoupling the locking. Signed-off-by: Tejun Heo

[PATCH 13/13] cpuset: replace cgroup_mutex locking with cpuset internal locking

2012-11-28 Thread Tejun Heo
. Signed-off-by: Tejun Heo --- kernel/cpuset.c | 186 1 file changed, 106 insertions(+), 80 deletions(-) diff --git a/kernel/cpuset.c b/kernel/cpuset.c index 79be3f0..2ee0e03 100644 --- a/kernel/cpuset.c +++ b/kernel/cpuset.c @@ -208,23

[PATCH 12/13] cpuset: schedule hotplug propagation from cpuset_attach() if the cpuset is empty

2012-11-28 Thread Tejun Heo
g. Signed-off-by: Tejun Heo --- kernel/cpuset.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/kernel/cpuset.c b/kernel/cpuset.c index 68a0906..79be3f0 100644 --- a/kernel/cpuset.c +++ b/kernel/cpuset.c @@ -266,6 +266,7 @@ static struct workqueue_struct *cpuset_propagate_hotplug_w

[PATCH 10/13] cpuset: make CPU / memory hotplug propagation asynchronous

2012-11-28 Thread Tejun Heo
while holding cgroup_mutex and waits for completion without cgroup_mutex. Each in-flight propagation holds a reference to the cpuset->css. This patch doesn't cause any functional difference. Signed-off-by: Tejun Heo --- kernel/cpuset.c | 54 -

[PATCH 02/13] cpuset: remove fast exit path from remove_tasks_in_empty_cpuset()

2012-11-28 Thread Tejun Heo
The function isn't that hot, the overhead of missing the fast exit is low, the test itself depends heavily on cgroup internals, and it's gonna be a hindrance when trying to decouple cpuset locking from cgroup core. Remove the fast exit path. Signed-off-by: Tejun Heo --- kernel/cp

[PATCH 06/13] cpuset: cleanup cpuset[_can]_attach()

2012-11-28 Thread Tejun Heo
cpuset_attach() and make the global variables static ones inside cpuset_attach(). While at it, convert cpus_attach to cpumask_t from cpumask_var_t. There's no reason to mess with dynamic allocation on a static buffer. Signed-off-by: Tejun Heo --- kernel/cpuset.c

[PATCH 08/13] cpuset: reorganize CPU / memory hotplug handling

2012-11-28 Thread Tejun Heo
le_hotplug() can handle multiple resources going up and down. These properties will allow async operation. The reorganization, while drastic, is equivalent and shouldn't cause any behavior difference. This will enable making hotplug handling async and remove get_online_cpus() -> cgroup_mut

[PATCH 07/13] cpuset: drop async_rebuild_sched_domains()

2012-11-28 Thread Tejun Heo
e CPU / memory hotplug path still grabs the two locks in the reverse order and thus this is a deadlock hazard; however, the two locks are already deadlock-prone and the hotplug path will be updated by further patches. Signed-off-by: Tejun Heo --- kernel/cpuset.c

[PATCH 03/13] cpuset: introduce ->css_on/offline()

2012-11-28 Thread Tejun Heo
() frees. This doesn't introduce any visible behavior changes. This will help cleaning up locking. Signed-off-by: Tejun Heo --- kernel/cpuset.c | 66 ++--- 1 file changed, 44 insertions(+), 22 deletions(-) diff --git a/kernel/cpuset.c b/k

[PATCHSET cgroup/for-3.8] cpuset: decouple cpuset locking from cgroup core

2012-11-28 Thread Tejun Heo
Hello, guys. Depending on cgroup core locking - cgroup_mutex - is messy and makes cgroup prone to locking dependency problems. The current code already has lock dependency loop - memcg nests get_online_cpus() inside cgroup_mutex. cpuset the other way around. Regardless of the locking details, w

Re: [PATCH 2/2] cgroup: list_del_init() on removed events

2012-11-28 Thread Tejun Heo
On Wed, Nov 28, 2012 at 01:50:45PM -0800, Greg Thelen wrote: > Use list_del_init() rather than list_del() to remove events from > cgrp->event_list. No functional change. This is just defensive > coding. > > Signed-off-by: Greg Thelen Applied 1-2 to cgroup/for-3.8. Thanks! -- tejun -- To uns
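
The "defensive coding" point in the quoted patch description can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel's list implementation: struct list_node and the node_*() helpers are hypothetical stand-ins for struct list_head and its helpers. It shows that a node which is re-initialized on removal points back at itself, so a second removal or an emptiness test on it is harmless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_node {
    struct list_node *next, *prev;
};

/* An initialized node (or an empty list head) points at itself. */
static void node_init(struct list_node *n)
{
    n->next = n;
    n->prev = n;
}

/* Insert n right after head. */
static void node_add(struct list_node *n, struct list_node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink n and re-initialize it, so later list operations on n stay safe. */
static void node_del_init(struct list_node *n)
{
    assert(n != NULL && n->next != NULL && n->prev != NULL);
    n->prev->next = n->next;
    n->next->prev = n->prev;
    node_init(n);
}

static bool node_empty(const struct list_node *n)
{
    return n->next == n;
}
```

After node_del_init() the removed node is a valid empty list of its own. A plain unlink would instead leave its next/prev pointers referring to memory the node no longer belongs to.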

[PATCHSET cgroup/for-3.8] cpuset: drop cpuset->stack_list and ->parent

2012-11-28 Thread Tejun Heo
Hello, guys. cpuset implements its own descendant iteration using cpuset->stack_list and has its own ->parent pointer. There's nothing cpuset specific about descendant walking or finding the parent. This patchset makes cpuset use cgroup generic API instead. 0001-cpuset-implement-cgroup_rightmo

[PATCH 2/3] cpuset: replace cpuset->stack_list with cpuset_for_each_descendant_pre()

2012-11-28 Thread Tejun Heo
Implement cpuset_for_each_descendant_pre() and replace the cpuset-specific tree walking using cpuset->stack_list with it. Signed-off-by: Tejun Heo --- kernel/cpuset.c | 123 ++-- 1 file changed, 48 insertions(+), 75 deletions(-) diff --gi

[PATCH 3/3] cpuset: remove cpuset->parent

2012-11-28 Thread Tejun Heo
cgroup already tracks the hierarchy. Follow cgroup->parent to find the parent and drop cpuset->parent. Signed-off-by: Tejun Heo --- kernel/cpuset.c | 28 +--- 1 file changed, 17 insertions(+), 11 deletions(-) diff --git a/kernel/cpuset.c b/kernel/cpuset.c index 3

[PATCH 1/3] cpuset: implement cgroup_rightmost_descendant()

2012-11-28 Thread Tejun Heo
Implement cgroup_rightmost_descendant() which returns the right most descendant of the specified cgroup. This can be used to skip the cgroup's subtree while iterating with cgroup_for_each_descendant_pre(). Signed-off-by: Tejun Heo Cc: Michal Hocko --- include/linux/cgroup.h | 1 + k

Re: [PATCHSET cgroup/for-3.8] cpuset: decouple cpuset locking from cgroup core

2012-11-29 Thread Tejun Heo
Hello, Glauber. On Thu, Nov 29, 2012 at 03:14:41PM +0400, Glauber Costa wrote: > On 11/29/2012 01:34 AM, Tejun Heo wrote: > > This patchset decouples cpuset locking from cgroup_mutex. After the > > patchset, cpuset uses cpuset-specific cpuset_mutex instead of > > cgroup_mute

Re: [PATCHSET cgroup/for-3.8] cpuset: decouple cpuset locking from cgroup core

2012-11-29 Thread Tejun Heo
On Thu, Nov 29, 2012 at 06:26:46AM -0800, Tejun Heo wrote: > > What I'll try to do, is to come with another specialized lock in cgroup > > just for this case. So after taking the cgroup lock, we would also take > > an extra lock if we are adding another entry - be it task o

Re: workqueue code needing preemption disabled

2013-03-18 Thread Tejun Heo
Hello, Steven. On Mon, Mar 18, 2013 at 10:36:23AM -0400, Steven Rostedt wrote: > kernel BUG at kernel/sched/core.c:1731! > invalid opcode: [#1] PREEMPT SMP > CPU 5 > Pid: 16637, comm: kworker/5:0 Not tainted 3.6.11-rt30.25.el6rt.x86_64 #1 HP > ProLiant DL580 G7 ... > static void try_to_wak

Re: workqueue code needing preemption disabled

2013-03-18 Thread Tejun Heo
Hey, Steven. On Mon, Mar 18, 2013 at 12:23:19PM -0400, Steven Rostedt wrote: > > Maybe I'm confused but I can't really see how the above would be a > > problem to workqueue in itself. Both rq->lock and gcwq->lock are > > irq-safe, so spin_lock() not disabling preemption shouldn't be a > > problem

Re: workqueue code needing preemption disabled

2013-03-18 Thread Tejun Heo
Hello, Steven. On Mon, Mar 18, 2013 at 12:30:43PM -0400, Steven Rostedt wrote: > If you happen to know the critical areas that require preemption to be > disabled for real, we can encapsulate them with: > > preempt_disable_rt(); > > preempt_enable_rt(); > > These are currently only

Re: workqueue code needing preemption disabled

2013-03-18 Thread Tejun Heo
On Mon, Mar 18, 2013 at 12:41:23PM -0400, Steven Rostedt wrote: > But, I'm worried about the loops that are done while holding this lock. > Just looking at is_chained_work() that does for_each_busy_worker(), how > big can that list be? If it's bound by # of CPUs then that may be fine, > but if it c

Re: [PATCH REPOST v3.9-rc1] sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY

2013-03-18 Thread Tejun Heo
Hello, On Mon, Mar 18, 2013 at 10:41:40AM +0100, Ingo Molnar wrote: > > This patch replaces PF_THREAD_BOUND with PF_NO_SETAFFINITY. > > sched_setaffinity() checks the flag and return -EINVAL if set. > > set_cpus_allowed_ptr() is no longer affected by the flag. > > > > This will allow simplifying

Re: workqueue code needing preemption disabled

2013-03-18 Thread Tejun Heo
On Mon, Mar 18, 2013 at 02:23:56PM -0400, Steven Rostedt wrote: > On Mon, 2013-03-18 at 09:43 -0700, Tejun Heo wrote: > > Hello, Steven. > > > > On Mon, Mar 18, 2013 at 12:30:43PM -0400, Steven Rostedt wrote: > > > If you happen to know the critical areas

Re: workqueue code needing preemption disabled

2013-03-18 Thread Tejun Heo
On Mon, Mar 18, 2013 at 01:08:07PM -0400, Steven Rostedt wrote: > On Mon, 2013-03-18 at 09:43 -0700, Tejun Heo wrote: > > > Making gcwq locks disable preemption would be much safer / easier, but > > if that's not desirable, anything touching gcwq->idle_list would be

Re: workqueue code needing preemption disabled

2013-03-18 Thread Tejun Heo
On Mon, Mar 18, 2013 at 02:57:30PM -0400, Steven Rostedt wrote: > I like the theory, but it has one flaw. I agree that the update should > be wrapped in preempt_disable() but since this bug happens on the same > CPU, the state of the list will be the same when it was preempted to > when it bugged.

[PATCH] scheduler: convert BUG_ON()s in try_to_wake_up_local() to WARN_ON_ONCE()s

2013-03-18 Thread Tejun Heo
getting unblocked. There's no reason to trigger BUG while holding rq lock crashing the whole system. Convert BUG_ON()s in try_to_wake_up_local() to WARN_ON_ONCE()s. Signed-off-by: Tejun Heo Cc: Steven Rostedt --- kernel/sched/core.c |6 -- 1 file changed, 4 insertions(+), 2 dele

[GIT PULL] workqueue fixes changes for 3.9-rc3

2013-03-18 Thread Tejun Heo
Hello, Linus. Please pull from the following branch to receive Lai's patch to fix highly unlikely but still possible workqueue stall during CPU hotunplug. git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-3.9-fixes Thanks.

Re: [PATCHSET wq/for-3.10] workqueue: misc cleanups

2013-03-18 Thread Tejun Heo
On Wed, Mar 13, 2013 at 07:54:01PM -0700, Tejun Heo wrote: > On Wed, Mar 13, 2013 at 04:58:12PM -0700, Tejun Heo wrote: > > This patch is on top of wq/for-3.10 e626761691 ("workqueue: implement > > current_is_workqueue_rescuer()"). > > Oops, forgot the git branch.

Re: [PATCHSET wq/for-3.10] workqueue: break up workqueue_lock into multiple locks

2013-03-18 Thread Tejun Heo
On Wed, Mar 13, 2013 at 07:57:18PM -0700, Tejun Heo wrote: > and available in the following git branch. > > git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git review-finer-locking Applied to wq/for-3.10. Thanks. -- tejun

Re: [PATCH 3/4] writeback: replace custom worker pool implementation with unbound workqueue

2013-03-18 Thread Tejun Heo
Hello, Jan. On Mon, Mar 18, 2013 at 11:32:44PM +0100, Jan Kara wrote: > I realized there may be one issue - so far we have a clear identification > which thread works for which bdi in the thread name (flush-x:y naming). > That was useful when debugging things. Now with your worker pool this is >

Re: [PATCH] sysfs: use atomic_inc_unless_negative in sysfs_get_active

2013-03-18 Thread Tejun Heo
return NULL; > > - > > - t = atomic_cmpxchg(&sd->s_active, v, v + 1); > > - if (likely(t == v)) > > - break; > > - if (t < 0) > > - return NULL; > > - > > -
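
The removed lines quoted above are an open-coded compare-and-exchange loop that increments a counter only while it is non-negative. As a rough illustration of that pattern, here is a userspace sketch using C11 atomics. It is an assumption-laden analogue, not the kernel's atomic_inc_unless_negative(); the function name inc_unless_negative is made up for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Increment *v only if its current value is >= 0.
 * Returns true if the increment happened, false if the value was negative.
 */
static bool inc_unless_negative(atomic_int *v)
{
    assert(v != NULL);

    int cur = atomic_load(v);

    while (cur >= 0) {
        /*
         * If *v still equals cur, store cur + 1 and succeed.
         * Otherwise cur is refreshed with the latest value and the
         * loop re-checks the sign before retrying.
         */
        if (atomic_compare_exchange_weak(v, &cur, cur + 1))
            return true;
    }
    return false;
}
```

A caller treats a false return as "the object is being deactivated" and backs off, which corresponds to the return NULL paths visible in the quoted diff.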

Re: [PATCH 3/4] writeback: replace custom worker pool implementation with unbound workqueue

2013-03-19 Thread Tejun Heo
Hello, Jan. On Tue, Mar 19, 2013 at 8:46 AM, Jan Kara wrote: > Well, but what you often get is just output of sysrq-w, or sysrq-t, or > splat from scheduler about stuck task. You often don't have the comfort of > tracing... Can't we somehow change 'comm' of the task when it starts > processing

Re: [PATCH REPOST v3.9-rc1] sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY

2013-03-19 Thread Tejun Heo
On Mon, Mar 18, 2013 at 09:47:15AM -0700, Tejun Heo wrote: > Hello, > > On Mon, Mar 18, 2013 at 10:41:40AM +0100, Ingo Molnar wrote: > > > This patch replaces PF_THREAD_BOUND with PF_NO_SETAFFINITY. > > > sched_setaffinity() checks the flag and return -EINVAL if set.

Re: [PATCHSET wq/for-3.10] workqueue: simplify per-cpu worker rebinding and implement unbound worker CPU affinity restoration

2013-03-19 Thread Tejun Heo
On Thu, Mar 14, 2013 at 04:01:25PM -0700, Tejun Heo wrote: > 0001-sched-replace-PF_THREAD_BOUND-with-PF_NO_SETAFFINITY.patch > 0002-workqueue-convert-worker_pool-worker_ida-to-idr-and-.patch > 0003-workqueue-relocate-rebind_workers.patch > 0004-workqueue-directly-restore-CPU

[PATCH wq/for-3.10] workqueue: define workqueue_freezing static variable iff CONFIG_FREEZER

2013-03-19 Thread Tejun Heo
From a0265a7f5161b6cb55e82b71edb236bbe0d9b3ae Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Tue, 19 Mar 2013 13:55:42 -0700 699ce097efe ("workqueue: implement and use pwq_adjust_max_active()") replaced the only workqueue_freezing usage outside freezer callbacks with a POOL_FREEZIN

Re: linux-next: manual merge of the workqueues tree with Linus' tree

2013-03-19 Thread Tejun Heo
Hey, Stephen. On Tue, Mar 19, 2013 at 01:19:38PM +1100, Stephen Rothwell wrote: > @@@ -456,40 -462,30 +462,30 @@@ static int worker_pool_assign_id(struc > { > int ret; > > - mutex_lock(&worker_pool_idr_mutex); > - ret = idr_alloc(&worker_pool_idr, pool, 0, 0, GFP_KERNEL); > -

Re: [PATCH v5 0/4] devcg: introduce proper hierarchy support

2013-03-19 Thread Tejun Heo
On Fri, Feb 15, 2013 at 11:55:43AM -0500, Aristeu Rozanski wrote: > This patchset implements device cgroup hierarchy. Exceptions will be > propagated down in the tree and local preferences will be re-evaluated > everytime a change in its parent occours, reapplying them if it's still > possible. Ap

Re: [PATCH v2] cgroup: consolidate cgroup_attach_task() and cgroup_attach_proc()

2013-03-19 Thread Tejun Heo
On Wed, Mar 13, 2013 at 09:17:09AM +0800, Li Zefan wrote: > These two functions share most of the code. > > Signed-off-by: Li Zefan Applied to cgroup/for-3.10. Thanks. -- tejun

[PATCH cgroup/for-3.10] cgroup: make cgroup_mutex outer to threadgroup_lock

2013-03-19 Thread Tejun Heo
rsion. cgroup_mutex is no longer abused by controllers and can be put outer to threadgroup_lock. Reverse the locking order in attach_task_by_pid(). Signed-off-by: Tejun Heo Cc: Li Zefan --- Li, can you please ack this? Thanks! kernel/cgroup.c | 21 - 1 file changed, 8 inser

Re: linux-next: manual merge of the workqueues tree with Linus' tree

2013-03-19 Thread Tejun Heo
On Wed, Mar 20, 2013 at 09:05:40AM +1100, Stephen Rothwell wrote: > > Anyways, I pulled master into wq/for-next and resolved it there, so it > > shouldn't cause you any more trouble. > > Ah, OK, thanks. One small point, when you do a back merge like that, > you should always put an explanation i

[PATCH 02/10] workqueue: drop 'H' from kworker names of unbound worker pools

2013-03-19 Thread Tejun Heo
duled NUMA awareness support. Let's drop the non-essential 'H' postfix from unbound kworker name. While at it, restructure kthread_create*() invocation to help future NUMA related changes. Signed-off-by: Tejun Heo --- kernel/workqueue.c | 15 --- 1 file chang

[PATCH 03/10] workqueue: determine NUMA node of workers accourding to the allowed cpumask

2013-03-19 Thread Tejun Heo
k contained inside single NUMA node, but this will serve as foundation for making all unbound pools NUMA-affine. Signed-off-by: Tejun Heo --- kernel/workqueue.c | 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c inde

[PATCH 06/10] workqueue: move hot fields of workqueue_struct to the end

2013-03-19 Thread Tejun Heo
is implemented. Signed-off-by: Tejun Heo --- kernel/workqueue.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 151ce49..25dab9d 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -230,8 +230,6 @@ struct wq_d

[PATCH 08/10] workqueue: break init_and_link_pwq() into two functions and introduce alloc_unbound_pwq()

2013-03-19 Thread Tejun Heo
e any functional changes. Signed-off-by: Tejun Heo --- kernel/workqueue.c | 75 +- 1 file changed, 52 insertions(+), 23 deletions(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 3f820a5..bbbfc92 100644 --- a/kernel/workqueue.c

[PATCH 07/10] workqueue: map an unbound workqueues to multiple per-node pool_workqueues

2013-03-19 Thread Tejun Heo
_workqueue as first_pwq() so this patch doesn't make any behavior changes. Signed-off-by: Tejun Heo --- kernel/workqueue.c | 48 +--- 1 file changed, 37 insertions(+), 11 deletions(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 25dab9d..3f

[PATCH 10/10] workqueue: update sysfs interface to reflect NUMA awareness and a kernel param to disable NUMA affinity

2013-03-19 Thread Tejun Heo
't part of a pool's attributes. It only affects how apply_workqueue_attrs() picks which pools to use. After "pool_ids" change, first_pwq() doesn't have any user left. Removed. Signed-off-by: Tejun Heo --- Documentation/kernel-parameters.txt | 9 +++ include/l

[PATCH 09/10] workqueue: implement NUMA affinity for unbound workqueues

2013-03-19 Thread Tejun Heo
d from workqueue users by the number of concurrently active work items and this change shouldn't matter much. Signed-off-by: Tejun Heo --- kernel/workqueue.c | 108 ++--- 1 file changed, 86 insertions(+), 22 deletions(-) diff --git a/kernel

[PATCHSET wq/for-3.10] workqueue: NUMA affinity for unbound workqueues

2013-03-19 Thread Tejun Heo
Hello, There are two types of workqueues - per-cpu and unbound. The former is bound to each CPU and the latter isn't not bound to any by default. While the recently added attrs support allows unbound workqueues to be confined to subset of CPUs, it still is quite cumbersome for applications where

[PATCH 01/10] workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]

2013-03-19 Thread Tejun Heo
. Signed-off-by: Tejun Heo --- kernel/workqueue.c | 35 ++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 775c2f4..9b096e3 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -44,6 +44,7 @@ #include

[PATCH 04/10] workqueue: add workqueue->unbound_attrs

2013-03-19 Thread Tejun Heo
the above assumption will no longer hold. Introduce workqueue->unbound_attrs which records the current attrs in effect and use it for sysfs instead of first_pwq()->attrs. Signed-off-by: Tejun Heo --- kernel/workqueue.c | 36 1 file changed, 24 insertions(

[PATCH 05/10] workqueue: make workqueue->name[] fixed len

2013-03-19 Thread Tejun Heo
Currently workqueue->name[] is of flexible length. We want to use the flexible field for something more useful and there isn't much benefit in allowing arbitrary name length anyway. Make it fixed len capping at 24 bytes. Signed-off-by: Tejun Heo --- kernel/workqueu

Re: [PATCHSET wq/for-3.10] workqueue: break up workqueue_lock into multiple locks

2013-03-20 Thread Tejun Heo
Hey, On Wed, Mar 20, 2013 at 11:01:50PM +0900, JoonSoo Kim wrote: > 2013/3/19 Tejun Heo : > > On Wed, Mar 13, 2013 at 07:57:18PM -0700, Tejun Heo wrote: > >> and available in the following git branch. > >> > >> git://git.kernel.org/pub/scm/linux/kernel/git/

Re: [PATCH 01/10] workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 11:08:29PM +0900, JoonSoo Kim wrote: > 2013/3/20 Tejun Heo : > > Unbound workqueues are going to be NUMA-affine. Add wq_numa_tbl_len > > and wq_numa_possible_cpumask[] in preparation. The former is the > > highest NUMA node ID + 1 and the latter is ma

Re: linux-next: build failure after merge of the final tree (cgroup tree related)

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 03:43:28PM +1100, Stephen Rothwell wrote: > It has returned today. Please be more careful. > > In file included from include/linux/memcontrol.h:22:0, > from include/linux/swap.h:8, > from include/linux/suspend.h:4, > from

Re: [PATCH cgroup/for-3.10] cgroup: make cgroup_mutex outer to threadgroup_lock

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 08:58:08AM +0800, Li Zefan wrote: > On 2013/3/20 6:02, Tejun Heo wrote: > > It doesn't make sense to nest cgroup_mutex inside threadgroup_lock > > when it should be outer to most all locks used by all cgroup > > controllers. It was nested in

Re: [PATCH 09/10] workqueue: implement NUMA affinity for unbound workqueues

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 11:03:53PM +0800, Lai Jiangshan wrote: > > +enomem: > > + free_workqueue_attrs(tmp_attrs); > > + if (pwq_tbl) { > > + for_each_node(node) > > + kfree(pwq_tbl[node]); > > It will free the dfl_pwq multi times. Oops, you're righ

Re: [PATCH 09/10] workqueue: implement NUMA affinity for unbound workqueues

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 8:26 AM, Lai Jiangshan wrote: >> for_eahc_node(node) >> if (pwq_tbl[node] != dfl_pwq) >> kfree(pwq_tbl[node]); >> kfree(dfl_pwq); > > I also missed. > we still need to put_unbound_pool() before free(pwq) Yeap, we do.
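
The double-free hazard discussed in this exchange comes from a table in which several slots alias one shared default entry. The sketch below shows the shape of the quoted fix in plain userspace C: skip aliased slots in the loop, then free the shared entry exactly once. The names free_table, tbl and dfl are hypothetical, and the put_unbound_pool() step that the reply says is still needed is deliberately not modeled.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Free every distinct entry of tbl[0..len-1], where any slot may alias
 * the shared default entry dfl. Returns how many allocations were freed.
 */
static int free_table(void **tbl, int len, void *dfl)
{
    int freed = 0;

    assert(tbl != NULL && len >= 0);

    for (int i = 0; i < len; i++) {
        /* Slots that alias the default are skipped here... */
        if (tbl[i] != NULL && tbl[i] != dfl) {
            free(tbl[i]);
            freed++;
        }
        tbl[i] = NULL;
    }

    /* ...and the default itself is released exactly once. */
    if (dfl != NULL) {
        free(dfl);
        freed++;
    }
    return freed;
}
```

With a table such as { dfl, own, dfl, NULL }, an unconditional per-slot free would release dfl twice. The comparison against dfl is what prevents that.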

Re: [PATCH 01/10] workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 11:43:30PM +0800, Lai Jiangshan wrote: > > + for_each_node(node) > > + > > BUG_ON(!alloc_cpumask_var_node(&wq_numa_possible_cpumask[node], > > + GFP_KERNEL, node)); > > + for_each_possible_cpu(cpu) { > >

Re: [PATCH 08/10] workqueue: break init_and_link_pwq() into two functions and introduce alloc_unbound_pwq()

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 11:52:03PM +0800, Lai Jiangshan wrote: > > +static struct pool_workqueue *alloc_unbound_pwq(struct workqueue_struct > > *wq, > > + const struct workqueue_attrs *attrs) > > +{ > > + struct worker_pool *pool; > > + struct pool

Re: [PATCH 00/21] workqueue: cleanups and better locking for recent changes

2013-03-20 Thread Tejun Heo
Hey, On Thu, Mar 21, 2013 at 12:38:17AM +0800, Lai Jiangshan wrote: > I am sorry for replying so late and replied with so huge patchset. > > But problem happened now, my patches and your patches are conflict. > Which patchset should be rebased? > > I think my patches need be merged at first. Thu

[PATCH v2 09/10] workqueue: implement NUMA affinity for unbound workqueues

2013-03-20 Thread Tejun Heo
d from workqueue users by the number of concurrently active work items and this change shouldn't matter much. v2: Fixed pwq freeing in apply_workqueue_attrs() error path. Spotted by Lai. Signed-off-by: Tejun Heo Cc: Lai Jiangshan --- kern

[PATCH 11/10] workqueue: use NUMA-aware allocation for pool_workqueues workqueues

2013-03-20 Thread Tejun Heo
shan. Signed-off-by: Tejun Heo Cc: Lai Jiangshan --- kernel/workqueue.c |6 -- 1 file changed, 4 insertions(+), 2 deletions(-) --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -3683,12 +3683,14 @@ static void pwq_adjust_max_active(struct spin_unlock(&pwq->pool->loc

Re: [PATCH 01/21] workqueue: add missing POOL_FREEZING

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 03:28:01AM +0800, Lai Jiangshan wrote: > When we create a new pool via get_unbound_pool() when freezing. > the pool->flags' POOL_FREEZING is incorrect. > > Fix it by adding POOL_FREEZING if workqueue_freezing. > (wq_mutex is already held for accessing workqueue_freezing.) >

[PATCH 01/21] workqueue: add missing POOL_FREEZING

2013-03-20 Thread Tejun Heo
ing POOL_FREEZING if workqueue_freezing. wq_mutex is already held so no further locking is necessary. This also removes the unused static variable warning when !CONFIG_FREEZER. tj: Updated commit message. Signed-off-by: Lai Jiangshan Signed-off-by: Tejun Heo --- kernel/workqueue.c | 3 +++ 1 f

Re: [PATCH 02/21] workqueue: don't free pool->worker_idr by RCU

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 03:28:02AM +0800, Lai Jiangshan wrote: > pool->worker_idr nor worker->id is not protected by RCU. > don't need to free pool->worker_idr by RCU. > > Just free it directly. > > Signed-off-by: Lai Jiangshan ... > @@ -3462,6 +3461,7 @@ static void put_unbound_pool(struct work

Re: [PATCH 03/21] workqueue: simplify current_is_workqueue_rescuer()

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 03:28:03AM +0800, Lai Jiangshan wrote: > current_is_workqueue_rescuer() <-> current is worker and worker->rescue_wq > > Signed-off-by: Lai Jiangshan Applied to wq/for-3.10. -- tejun

Re: [PATCH 04/21] workqueue: swap the two branches in pwq_adjust_max_active() to get better readability

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 03:28:04AM +0800, Lai Jiangshan wrote: > "if (!freezable || !(pwq->pool->flags & POOL_FREEZING))" is hard to read. > > Swap the two branches. it becomes > "if (freezable && (pwq->pool->flags & POOL_FREEZING))", it is better. > > Signed-off-by: Lai Jiangshan > --- > kerne

[PATCH] workqueue: kick a worker in pwq_adjust_max_active()

2013-03-20 Thread Tejun Heo
both of the above two cases. This also makes thaw_workqueues() simpler. tj: Updated comments and description. Signed-off-by: Lai Jiangshan Signed-off-by: Tejun Heo --- kernel/workqueue.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/kernel/workqueue.c

Re: [PATCH 06/21] workqueue: separate out pools locking into pools_mutex

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 03:28:06AM +0800, Lai Jiangshan wrote: > currently wq_mutext protects: > > * worker_pool_idr and unbound_pool_hash > * pool->refcnt > * workqueues list > * workqueue->flags, ->nr_drainers > * workqueue_freezing > > We can see that it protects very different things. > So we

Re: [PATCH 10/21] workqueue: use rcu_read_lock_sched() instead for accessing pwq in RCU

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 03:28:10AM +0800, Lai Jiangshan wrote: > rcu_read_lock_sched() is better than preempt_disable() if the code is > protected by RCU_SCHED. > > Signed-off-by: Lai Jiangshan Applied to wq/for-3.10. Thanks. -- tejun

Re: [PATCH 15/21] workqueue: remove worker_maybe_bind_and_lock()

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 03:28:15AM +0800, Lai Jiangshan wrote: > static struct worker *alloc_worker(void) > { > struct worker *worker; > @@ -2326,7 +2262,8 @@ repeat: > spin_unlock_irq(&wq_mayday_lock); > > /* migrate to the target cpu if possible */ > -

Re: [PATCH 17/21] workqueue: simplify workqueue_cpu_up_callback(CPU_ONLINE)

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 03:28:17AM +0800, Lai Jiangshan wrote: > If we have 4096 CPUs, workqueue_cpu_up_callback() will travel too much CPUs, > to avoid it, we use for_each_cpu_worker_pool for the cpu pools and > use unbound_pool_hash for unbound pools. > > After it, for_each_pool() becomes unused

Re: [PATCH 18/21] workqueue: read POOL_DISASSOCIATED bit under pool->lock

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 03:28:18AM +0800, Lai Jiangshan wrote: > Simply move it to pool->lock C.S. Patch description should explain *why* in addition to what. Also, please don't use abbreviations like C.S. Yeah, I know it's critical section but it might as well be counter strike or context switc

Re: [PATCH 21/21] workqueue: avoid false negative in assert_manager_or_pool_lock()

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 03:28:21AM +0800, Lai Jiangshan wrote: > If lockdep complains something for other subsystem, lockdep_is_held() can be > false negative. so we need to also test debug_locks before do assert. > > Signed-off-by: Lai Jiangshan Applied to wq/for-3.10. Thanks. -- tejun

Re: [PATCH 00/21] workqueue: cleanups and better locking for recent changes

2013-03-20 Thread Tejun Heo
So, overall, On Wed, Mar 20, 2013 at 03:28:00AM +0800, Lai Jiangshan wrote: ... > In this list, we can find that: > 1) wq_mutex protects too much different kind of things. I don't agree with this and unless you can come up with a much better reason, won't be splitting wq_mutex further. Also, I'm

Re: [PATCH cgroup/for-3.10] cgroup: make cgroup_mutex outer to threadgroup_lock

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 11:35 AM, Oleg Nesterov wrote: > then we need "do not abuse ->cred_guard_mutex in threadgroup_lock()" > acked by you and Li. Please let me know if I should resend it. Yeah, we want that one regardless of this one. Please feel free to add my Acked-by (if I hadn't acked alre

[PATCH v2 UPDATED 09/10] workqueue: implement NUMA affinity for unbound workqueues

2013-03-20 Thread Tejun Heo
d from workqueue users by the number of concurrently active work items and this change shouldn't matter much. v2: Fixed pwq freeing in apply_workqueue_attrs() error path. Spotted by Lai. Signed-off-by: Tejun Heo Cc: Lai Jiangshan --- Please forget about the previous posting. It was fre

Re: [PATCHSET wq/for-3.10] workqueue: NUMA affinity for unbound workqueues

2013-03-20 Thread Tejun Heo
On Tue, Mar 19, 2013 at 05:00:19PM -0700, Tejun Heo wrote: > and also available in the following git branch. > > git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git review-numa Branch rebased on top of the current wq/for-3.10 with updated patches. The new co

Re: linux-next: build failure after merge of the final tree (cgroup tree related)

2013-03-20 Thread Tejun Heo
On Wed, Mar 20, 2013 at 3:09 PM, Stephen Rothwell wrote: > There has been no change to the cgroup tree > (git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git#for-next). > Forgot to push? Yeah, I'm on a roll today. My apologies. Just pushed out. Thanks. -- tejun -- To unsubscribe from t

Re: [PATCH V3 3/7] workqueue: Add helpers to schedule work on any cpu

2013-03-20 Thread Tejun Heo
Hello, Viresh. On Mon, Mar 18, 2013 at 08:53:25PM +0530, Viresh Kumar wrote: > queue_work() queues work on current cpu. This may wake up an idle CPU, which > is > actually not required. > > Some of these works can be processed by any CPU and so we must select a > non-idle > CPU here. The initia

Re: [PATCH 3/4] writeback: replace custom worker pool implementation with unbound workqueue

2013-03-20 Thread Tejun Heo
Hello, Dave. On Thu, Mar 21, 2013 at 12:57:21PM +1100, Dave Chinner wrote: > When you have a system that has 50+ active filesystems (pretty > common in the distributed storage environments were every disk has > it's own filesystem), knowing which filesystem(s) are getting stuck > in writeback from

[PATCH] idr: fix a subtle bug in idr_get_next()

2013-02-02 Thread Tejun Heo
igned offset - ie. use round_up(id + 1, slot_distance) instead of id += slot_distance. Signed-off-by: Tejun Heo Reported-by: David Teigland Cc: KAMEZAWA Hiroyuki --- lib/idr.c |9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/lib/idr.c b/lib/idr.c index 6482390..ca

Re: [PATCH] idr: fix a subtle bug in idr_get_next()

2013-02-02 Thread Tejun Heo
On Sat, Feb 02, 2013 at 03:10:48PM -0800, Tejun Heo wrote: > Fix it by ensuring proceeding to the next slot doesn't carry over the > unaligned offset - ie. use round_up(id + 1, slot_distance) instead of > id += slot_distance. > > Signed-off-by: Tejun Heo > Reported-b
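The difference between the two ways of advancing can be seen in a small standalone sketch. This is not the lib/idr.c code: the helper names below are hypothetical, and round_up_to() is a plain-C stand-in for the kernel's round_up() macro. It only illustrates why adding slot_distance to an unaligned id skips the first ids of the next slot.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's round_up(): round x up to the
 * next multiple of step (step > 0). */
static int round_up_to(int x, int step)
{
    return ((x + step - 1) / step) * step;
}

/* Advance as described for the old code: the unaligned offset of id
 * inside its slot is carried over into the next slot. */
static int next_slot_carry_offset(int id, int slot_distance)
{
    return id + slot_distance;
}

/* Advance as described in the fix: land on the first id of the next slot. */
static int next_slot_aligned(int id, int slot_distance)
{
    return round_up_to(id + 1, slot_distance);
}
```

With a slot_distance of 4, an id of 5 sits in the slot covering 4..7. The aligned version moves to 8, the first id of the next slot, while the carry-over version moves to 9 and would never look at id 8.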

[PATCHSET] idr: implement idr_alloc() and convert existing users

2013-02-02 Thread Tejun Heo
Hello, * Andrew, I think this one is best routed through -mm together with the previous series. Please read on. * Bruce, I couldn't convert nfsd. Can you please help? More on it later. * Stanislav, Eric, James, can you please take a look at ipc/util.c conversion? I moved allocation ins

[PATCH 04/62] idr: refactor idr_get_new_above()

2013-02-02 Thread Tejun Heo
Move slot filling to idr_fill_slot() from idr_get_new_above_int() and make idr_get_new_above() directly call it. idr_get_new_above_int() is no longer needed and is removed. This will be used to implement a new ID allocation interface. Signed-off-by: Tejun Heo --- lib/idr.c | 30

[PATCH 06/62] block: fix synchronization and limit check in blk_alloc_devt()

2013-02-02 Thread Tejun Heo
idr allocation in blk_alloc_devt() wasn't synchronized against lookup and removal, and its limit check was off by one - 1 << MINORBITS is the number of minors allowed, not the maximum allowed minor. Add locking and rename MAX_EXT_DEVT to NR_EXT_DEVT and fix limit checking. Signed-of
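The off-by-one can be illustrated with a minimal sketch. It is not the block layer code: the two helper functions are hypothetical, and MINORBITS is assumed to be 20 here to mirror the kernel's definition. The point is only that a count of 1 << MINORBITS minors means the largest valid index is one less than that count.

```c
#include <assert.h>

/* Assumption for this sketch: MINORBITS is 20, mirroring the kernel. */
#define MINORBITS   20
/* Number of extended minors, not the largest valid minor. */
#define NR_EXT_DEVT (1 << MINORBITS)

/* Hypothetical check treating the constant as a maximum: it wrongly
 * accepts idx == NR_EXT_DEVT. */
static int ext_idx_ok_off_by_one(int idx)
{
    return idx >= 0 && idx <= NR_EXT_DEVT;
}

/* Hypothetical check treating the constant as a count: valid indices
 * are 0 .. NR_EXT_DEVT - 1. */
static int ext_idx_ok(int idx)
{
    return idx >= 0 && idx < NR_EXT_DEVT;
}
```

Naming the constant NR_EXT_DEVT rather than MAX_EXT_DEVT makes the intended comparison (strictly less than) easier to read at the call site.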

[PATCH 15/62] drm: convert to idr_alloc()

2013-02-02 Thread Tejun Heo
Convert to the much saner new idr interface. Only compile tested. * drm_ctxbitmap_next() error handling in drm_addctx() seems broken. drm_ctxbitmap_next() returns -errno on failure, not -1. Signed-off-by: Tejun Heo Cc: David Airlie Cc: dri-de...@lists.freedesktop.org --- This patch depends on

[PATCH 22/62] infiniband/core: convert to idr_alloc()

2013-02-02 Thread Tejun Heo
Convert to the much saner new idr interface. Only compile tested. Signed-off-by: Tejun Heo Cc: Roland Dreier Cc: Sean Hefty Cc: Hal Rosenstock Cc: linux-r...@vger.kernel.org --- This patch depends on earlier idr changes and I think it would be best to route these together through -mm

[PATCH 28/62] infiniband/mlx4: convert to idr_alloc()

2013-02-02 Thread Tejun Heo
Convert to the much saner new idr interface. Only compile tested. Signed-off-by: Tejun Heo Cc: Jack Morgenstein Cc: Or Gerlitz Cc: Roland Dreier Cc: linux-r...@vger.kernel.org --- This patch depends on earlier idr changes and I think it would be best to route these together through -mm

[PATCH 33/62] mfd: convert to idr_alloc()

2013-02-02 Thread Tejun Heo
Convert to the much saner new idr interface. Only compile tested. Signed-off-by: Tejun Heo Cc: Samuel Ortiz --- This patch depends on earlier idr changes and I think it would be best to route these together through -mm. Please holler if there's any objection. Thanks. driver

[PATCH 41/62] pps: convert to idr_alloc()

2013-02-02 Thread Tejun Heo
Convert to the much saner new idr interface. Only compile tested. Signed-off-by: Tejun Heo Cc: Rodolfo Giometti --- This patch depends on earlier idr changes and I think it would be best to route these together through -mm. Please holler if there's any objection. Thanks. driver
