Chip Camden <sterl...@camdensoftware.com> wrote in <20110818025550.ga1...@libertas.local.camdensoftware.com>:
st> Quoth Attilio Rao on Thursday, 18 August 2011:
st> > In callout_cpu_switch() if a low priority thread is migrating the
st> > callout and gets preempted after the outcoming cpu queue lock is left
st> > (and scheduled much later) we get this problem.
st> >
st> > In order to fix this bug it could be enough to use a critical section,
st> > but I think this should be really interrupt safe, thus I'd wrap them
st> > up with spinlock_enter()/spinlock_exit(). Fortunately
st> > callout_cpu_switch() should be called rarely and also we already do
st> > expensive locking operations in callout, thus we should not have
st> > problem performance-wise.
st> >
st> > Can the guys I also CC'ed here try the following patch, with all the
st> > initial kernel options that were leading you to the deadlock? (thus
st> > revert any debugging patch/option you added for the moment):
st> > http://www.freebsd.org/~attilio/callout-fixup.diff
st> >
st> > Please note that this patch is for STABLE_8, if you can confirm the
st> > good result I'll commit to -CURRENT and then backmarge as soon as
st> > possible.
st> >
st> > Thanks,
st> > Attilio
st> >
st>
st> Thanks, Attilio. I've applied the patch and removed the extra debug
st> options I had added (though keeping debug symbols). I'll let you know if
st> I experience any more panics.

No panic for 20 hours at this moment, FYI. For my NFS server, I think
another 24 hours would be sufficient to confirm the stability. I will
see how it works...

--
Hiroki