On Wed, 2007-09-12 at 02:16 -0700, Andrew Morton wrote: > On Tue, 11 Sep 2007 23:49:47 +0200 Thomas Gleixner <[EMAIL PROTECTED]> wrote: > > > On Tue, 2007-09-11 at 21:52 +0200, Thomas Gleixner wrote: > > > > C1: type[C1] promotion[C2] demotion[--] > > > > latency[001] usage[00000010] duration[00000000000000000000] > > > > *C2: type[C2] promotion[--] demotion[C1] > > > > latency[001] usage[00008316] duration[00000000000170717293] > > > > > > Ok, here we are. The bad one uses C2 which stops the local apic on the > > > VAIO. I suspect we end up in the suspend/resume with going into C2 > > > without the broadcast active. > > > > > > Can you try to get the output of SysRq-Q during the "it needs help from > > > keyboard" period ? > > > > Summary of the oddities we are seing: > > > > 1.) disabling local apic timer makes the problem go away > > 2.) disabling nohz and highres makes the problem go away > > 3.) adding the cpuidle patches from the acpi tree makes the problem go > > away > > Only the fist cpuidle patch, actually. I'd have though this was a big hint?
Yes, it prevents the cpu to enter C2, which keeps the local APIC alive. > > The obvious conclusion is, that in all other cases we run into a state, > > where the local apic timer is not working. > > > > 1) we do not use it > > 2) it is used in periodic mode > > 3) the cpu does not enter C2 (which turns the lapic timer off on the > > VAIO) > > > > While 1) and 3) are understandable, the reason why 2) is working is a > > mystery because the periodic mode is affected by the C state as well. > > > > Andrew, can you please provide the output of /proc/timer_list when you > > boot the kernel with "nohz=off highres=off", but honestly I do not > > expect a lot of enlightenment from it. > Tick Device: mode: 0 > Clock Event Device: pit > max_delta_ns: 27461866 > min_delta_ns: 12571 > mult: 5124677 > shift: 32 > mode: 2 > next_event: 9223372036854775807 nsecs > set_next_event: pit_next_event > set_mode: init_pit_timer > event_handler: tick_handle_periodic_broadcast > tick_broadcast_mask: 00000001 > tick_broadcast_oneshot_mask: 00000000 Ok, I got my brain together and read through the code. The periodic mode switches to broadcast right away when the acpi code discovers that there is a lapic stops C-State available. Does the test hack below fix the problem for nohz/highres enabled kernels ? tglx --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -382,6 +382,8 @@ static int tick_broadcast_set_event(ktime_t expires, int force) int tick_resume_broadcast_oneshot(struct clock_event_device *bc) { + cpu_set(smp_processor_id(), tick_broadcast_oneshot_mask); + clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); if(!cpus_empty(tick_broadcast_oneshot_mask)) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/