On Sat, 16 Sep 2017, Fengguang Wu wrote:
> > > [    0.038086] Performance Events: unsupported p6 CPU model 61 no PMU
> > > driver, software events only.
> 
> What's your host CPU? I can reproduce it in Nehalem, Haswell and Sandy
> Bridge machines with the attached script.

My bad. I booted the wrong config ....

> > > [    0.041031] Hierarchical SRCU implementation.
> > > [    0.046210] NMI watchdog: Perf event create on CPU 0 failed with -2
> > > [    0.046980] NMI watchdog: Perf NMI watchdog permanetely disabled
> > > 
> > > Confused
> > 
> > I still can't reproduce. Can you please apply the debug patch below and
> > provide the output?
> 
> OK. I'll try and report back tomorrow.

Don't bother. I found it already. On UP we have:

#define for_each_cpu(cpu, mask)               \
        for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)

which is a total fail: the loop body always runs for CPU 0, no matter what
the mask contains. That breaks any code which uses for_each_cpu() or any of
the other variants on UP, because it assumes that all cpumasks have bit 0
set.

That means any code which does not have conditional code for some of the
cpumask functions is potentially broken. Sigh.
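
To make the failure mode concrete, here is a minimal user-space sketch. It
is not kernel code: the macro and helper names are made up for illustration,
the mask is a plain unsigned long instead of a struct cpumask, and the
mask-aware variant is only one possible way such a loop could look.

```c
/*
 * Stand-in for the UP for_each_cpu() quoted above: the mask is evaluated
 * only to silence warnings, its contents are never tested.
 */
#define up_for_each_cpu(cpu, mask) \
        for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)(mask))

/*
 * Hypothetical mask-aware variant for comparison: the body runs only
 * when the bit for the CPU is actually set in the mask.
 */
#define checked_for_each_cpu(cpu, mask) \
        for ((cpu) = 0; (cpu) < 1; (cpu)++) \
                if ((mask) & (1UL << (cpu)))

/* Count how many times the loop body runs with the UP-style macro. */
static int visits_up(unsigned long mask)
{
        int cpu, n = 0;

        up_for_each_cpu(cpu, mask)
                n++;
        return n;
}

/* Count how many times the loop body runs with the mask-aware macro. */
static int visits_checked(unsigned long mask)
{
        int cpu, n = 0;

        checked_for_each_cpu(cpu, mask)
                n++;
        return n;
}
```

With an empty mask, visits_up(0) still reports one iteration while
visits_checked(0) reports none. That is the situation the watchdog cleanup
runs into: dead_events_mask is empty, the loop body runs for CPU 0 anyway,
and the event pointer it picks up is NULL.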

The simple cure for the watchdog is below.

Thanks,

        tglx
8<------------------

diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index b2931154b5f2..d4c0f75b189e 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -221,7 +221,12 @@ void hardlockup_detector_perf_cleanup(void)
                struct perf_event *event = per_cpu(watchdog_ev, cpu);
 
                per_cpu(watchdog_ev, cpu) = NULL;
-               perf_event_release_kernel(event);
+               /*
+                * Check the event, because on UP for_each_cpu() assumes
+                * idiotically that all masks handed in have bit 0 set.
+                */
+               if (event)
+                       perf_event_release_kernel(event);
        }
        cpumask_clear(&dead_events_mask);
 }








