try_to_grab_pending() leave LINKED tagalong in delayed queue when
it deletes a work. This behavior will cause future
cwq_activate_first_delayed() increase the ->nr_active wrongly,
and may cause the whole cwq frozen.

example:

state:  cwq->max_active = 1, cwq->nr_active = 1
        one work in cwq->pool, many in cwq->delayed_works.

step1:  try_to_grab_pending() remove a work from delayed_works but leave 
tagalong.
step2:  when the work in cwq->pool is finished, cwq_activate_first_delayed()
        move the tagalong to cwq->pool and increase the ->nr_active.

current state: cwq->nr_active = 1, but works of the cwq in cwq->pool are all 
NO_COLOR,
        so even when these works are finished, cwq->nr_active will not be 
decreased,
        and no work will be moved from cwq->delayed_works. the cwq is frozen.

Fix it by moving the tagalong to cwq->pool in try_to_grab_pending().

Signed-off-by: Lai Jiangshan <la...@cn.fujitsu.com>
---
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 7b91332..1a268b2 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1101,11 +1101,26 @@ static int try_to_grab_pending(struct work_struct 
*work, bool is_dwork,
                 */
                smp_rmb();
                if (gcwq == get_work_gcwq(work)) {
+                       unsigned long delayed;
+
+                       /*
+                        * move LINKED tagalong(if exist) out from
+                        * delayed queue. Otherwise some future
+                        * cwq_activate_first_delayed() may move
+                        * works(which are all NO_COLOR) to cwq->pool
+                        * and increase the ->nr_active. It may
+                        * cause the whole cwq frozen.
+                        */
+                       delayed = *work_data_bits(work) & WORK_STRUCT_DELAYED;
+                       if (delayed)
+                               move_linked_works(work,
+                                       &get_work_cwq(work)->pool->worklist,
+                                       NULL);
+
                        debug_work_deactivate(work);
                        list_del_init(&work->entry);
                        cwq_dec_nr_in_flight(get_work_cwq(work),
-                               get_work_color(work),
-                               *work_data_bits(work) & WORK_STRUCT_DELAYED);
+                               get_work_color(work), delayed);
 
                        spin_unlock(&gcwq->lock);
                        return 1;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to