On Wed, Dec 07, 2016 at 05:53:47PM +0000, Mark Rutland wrote:
> On Wed, Dec 07, 2016 at 01:52:17PM +0000, Mark Rutland wrote:
> > Hi all
> > 
> > Jeremy noticed a kernel lockup on arm64 when the perf tool was used in
> > parallel with hotplug, which I've reproduced on arm64 and x86(-64) with
> > v4.9-rc8. In both cases I'm using defconfig; I've tried enabling lockdep
> > but it was silent for arm64 and x86.
> 
> It looks like we're trying to install a task-bound event into a context
> where task_cpu(ctx->task) is dead, and thus the cpu_function_call() in
> perf_install_in_context() fails. We retry repeatedly.
> 
> On !PREEMPT (as with x86 defconfig), we manage to prevent the hotplug
> machinery from making progress, and this turns into a livelock.
> 
> On PREEMPT (as with arm64 defconfig), I'm somewhat lost.

So the problem is that even with PREEMPT we can hit a blocked task
that has a 'dead' cpu.

We'll spin until either the task wakes up or the CPU comes back online,
and either can take a very long time.
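
For reference, the retry loop in question looks roughly like this
(condensed from v4.9's perf_install_in_context(); the TASK_TOMBSTONE
checks and surrounding setup are elided):

  again:
        /*
         * Cannot use task_function_call() because we need to run on the
         * task's CPU regardless of whether it is current or not.
         */
        if (!cpu_function_call(task_cpu(task), __perf_install_in_context, event))
                return;

        raw_spin_lock_irq(&ctx->lock);
        task = ctx->task;
        ...
        raw_spin_unlock_irq(&ctx->lock);
        /*
         * Since !ctx->is_active doesn't mean anything, we must IPI
         * unconditionally.
         */
        goto again;

With task_cpu(task) offline, cpu_function_call() fails every time and we
keep taking the 'goto again' path. On !PREEMPT that loop keeps the hotplug
machinery from making progress (hence the cond_resched() below); on
PREEMPT we still spin for as long as the task stays blocked on the dead
CPU.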

How exactly your test-case triggers this is still a mystery; all it
executes is 'true', and that really shouldn't block for long.

In any case, the below cures things, but it is, as the comment says,
horrific.

I've yet to come up with a better patch that doesn't violate scheduler
internals.

---
 kernel/events/core.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6ee1febdf6ff..6faf4b03396e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1203,6 +1203,11 @@ static void put_ctx(struct perf_event_context *ctx)
  *           perf_event_context::lock
  *         perf_event::mmap_mutex
  *         mmap_sem
+ *
+ *    task_struct::pi_lock
+ *      rq::lock
+ *        perf_event_context::lock
+ *
  */
 static struct perf_event_context *
 perf_event_ctx_lock_nested(struct perf_event *event, int nesting)
@@ -2352,6 +2357,28 @@ perf_install_in_context(struct perf_event_context *ctx,
                return;
        }
        raw_spin_unlock_irq(&ctx->lock);
+
+       raw_spin_lock_irq(&task->pi_lock);
+       if (!(task->state == TASK_RUNNING || task->state == TASK_WAKING)) {
+               /*
+                * XXX horrific hack...
+                */
+               raw_spin_lock(&ctx->lock);
+               if (task != ctx->task) {
+                       raw_spin_unlock(&ctx->lock);
+                       raw_spin_unlock_irq(&task->pi_lock);
+                       goto again;
+               }
+
+               add_event_to_ctx(event, ctx);
+               raw_spin_unlock(&ctx->lock);
+               raw_spin_unlock_irq(&task->pi_lock);
+               return;
+       }
+       raw_spin_unlock_irq(&task->pi_lock);
+
+       cond_resched();
+
        /*
         * Since !ctx->is_active doesn't mean anything, we must IPI
         * unconditionally.
