On 08/30/2013 04:04 PM, Gerlando Falauto wrote:
> Hi,
>
> sorry, it took me a while to narrow it down...
>
> On 08/30/2013 01:45 AM, John Stultz wrote:
>> On 08/29/2013 01:56 PM, Falauto, Gerlando wrote:
>>> Hi everyone,
>>>
>>> I ran into the deadlock situation reported at the bottom.
>>> Actually, on my latest 3.10 kernel for some reason I don't get the
>>> report (the kernel just hangs for some reason), so it took me quite
>>> some time to track it down.
>>>
>>> Once I figured the trigger to the machine hanging was adjtimex(), I
>>> reverted everything (between 3.9 to 3.10) that was touching
>>> kernel/time/timekeeping/timekeeping.c and kernel/time/ntp.c, I double
>>> checked that indeed the problem was not happening anymore, and finally
>>> started bisecting, landing on the following offending commit.
>>> THEN, and ONLY THEN, did I get the &%""ç+"% deadlock report.
>>>
>>> Do you guys have any ideas what could be wrong and how to fix it?
>>
>> Thanks for the report!
>>
>> What exactly is your process for reproducing the issue?
>
> Now (well, now...), it's quite easy.
>
> Three ingredients:
>
> 1) Kernel 3.10
>
> 2) Enable HRTICK
>
> diff --git a/kernel/sched/features.h b/kernel/sched/features.h
> index 99399f8..294e3ca 100644
> --- a/kernel/sched/features.h
> +++ b/kernel/sched/features.h
> @@ -41,7 +41,7 @@ SCHED_FEAT(WAKEUP_PREEMPTION, true)
>   */
>  SCHED_FEAT(ARCH_POWER, true)
>
> -SCHED_FEAT(HRTICK, false)
> +SCHED_FEAT(HRTICK, true)
>  SCHED_FEAT(DOUBLE_TICK, false)
>  SCHED_FEAT(LB_BIAS, true)
>
> 3) Run the following:
>
> #include <stdio.h>
> #include <sys/timex.h>
>
> int main(void)
> {
>         int i;
>
>         for (i = 0 ; ; i++) {
>                 struct timex adj = {};
>                 printf("%d\r", i);
>                 fflush(stdout);
>                 adjtimex(&adj);
>         }
>         return 0;
> }
>
> Notice how:
> 1) The original issue (with a bit more complicated scenario) was seen
>    on ARM and PowerPC platforms
> 2) Under the above test conditions (on ARM) I *don't* get any deadlock
>    report printed, the machine just hangs
> 3) The offending commit (below) I had found through a weird (manual)
>    process of reverting and re-reverting (where some commits could have
>    been reverted out of order), so I'm not 100% sure you'd come to the
>    same conclusions.
>
> commit 06c017fdd4dc48451a29ac37fc1db4a3f86b7f40
> Author: John Stultz <john.stu...@linaro.org>
> Date:   Fri Mar 22 11:37:28 2013 -0700
>
>     timekeeping: Hold timekeepering locks in do_adjtimex and hardpps
>
> I'm not able to perform any further testing at this very moment, but
> if needed, I can try bisecting again sometime next week, so to make an
> even more reliable statement.
>

Thanks so much for the details! I'll take a shot at reproducing this and
will let you know what comes of it.

thanks
-john