On Fri, May 29, 2020 at 10:14 PM Tejun Heo <t...@kernel.org> wrote:
>
> On Fri, May 29, 2020 at 06:59:00AM +0000, Lai Jiangshan wrote:
> > Now rescuer checks pwq->nr_active before requeues the pwq,
> > it is a more robust check and the rescuer must be still valid.
> >
> > Signed-off-by: Lai Jiangshan <la...@linux.alibaba.com>
> > ---
> >  kernel/workqueue.c | 23 +++++++++--------------
> >  1 file changed, 9 insertions(+), 14 deletions(-)
> >
> > diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> > index b2b15f1f0c8d..8d017727bfbc 100644
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> > @@ -248,7 +248,7 @@ struct workqueue_struct {
> >         struct list_head        flusher_overflow; /* WQ: flush overflow list */
> >
> >         struct list_head        maydays;        /* MD: pwqs requesting rescue */
> > -       struct worker           *rescuer;       /* MD: rescue worker */
> > +       struct worker           *rescuer;       /* I: rescue worker */
> >
> >         int                     nr_drainers;    /* WQ: drain in progress */
> >         int                     saved_max_active; /* WQ: saved pwq max_active */
> > @@ -2532,12 +2532,13 @@ static int rescuer_thread(void *__rescuer)
> >                         if (pwq->nr_active && need_to_create_worker(pool)) {
> >                                 spin_lock(&wq_mayday_lock);
> >                                 /*
> > -                                * Queue iff we aren't racing destruction
> > -                                * and somebody else hasn't queued it already.
> > +                                * Queue iff somebody else hasn't queued it
> > +                                * already.
> >                                  */
> > -                               if (wq->rescuer && list_empty(&pwq->mayday_node)) {
> > +                               if (list_empty(&pwq->mayday_node)) {
> >                                         get_pwq(pwq);
> > -                                       list_add_tail(&pwq->mayday_node, &wq->maydays);
> > +                                       list_add_tail(&pwq->mayday_node,
> > +                                                     &wq->maydays);
>
> send_mayday() also checks for wq->rescuer, so when sanity check fails,
> scenarios which would have leaked a workqueue after destroying its rescuer
> can lead to use-after-free after the patch. I'm not quite sure why the patch
> is an improvement.
>
Hi,

I'm not sure I understood your point, and I'm not sure which function
may use a freed object in the "use-after-free" case. Is it
"send_mayday() may use a freed rescuer"?

This patch relies on def98c84b6 ("workqueue: Fix spurious sanity check
failures in destroy_workqueue()") to move the kthread_stop() before the
sanity check, and on the work of drain_workqueue(), which guarantees
that there is no work item in the workqueue.

If send_mayday() still goes wrong after drain_workqueue(), the user
must have queued work items and invoked destroy_workqueue()
concurrently. It is excellent if the sanity check can catch this case,
but the sanity check cannot be expected to always survive it, since it
is not the workqueue's internal fault.

We hope the sanity check can find all internal faults, but not to the
extent that it always works when a user uses the workqueue in a very
wrong way.

Thanks,
Lai