On Mon, Mar 29, 2010 at 02:27:34PM -0400, John Baldwin wrote:
> On Monday 29 March 2010 1:30:38 pm Jeremy Chadwick wrote:
> > On Mon, Mar 29, 2010 at 05:01:02PM +0000, Masoom Shaikh wrote:
> > > On Sun, Mar 28, 2010 at 5:38 PM, Ivan Voras <ivo...@freebsd.org> wrote:
> > > > On 28 March 2010 16:42, Masoom Shaikh <masoom.sha...@gmail.com> wrote:
> > > >
> > > >> lets assume if this is h/w problem, then how can other OSes overcome
> > > >> this ? is there a way to make FreeBSD ignore this as well, let it
> > > >> result in reasonable performance penalty.
> > > >
> > > > Very probably, if only we could detect where the problem is.
> > > > Try adding "options PRINTF_BUFR_SIZE=128" to the kernel
> > >
> > > this option is already there
> >
> > The key word in Ivan's phrase is "less mangled". Neither using nor
> > increasing PRINTF_BUFR_SIZE solves the problem of interspersed console
> > output. I've been ranting/raving about this problem for years now; it
> > truly looks like a mutex lock issue (or lack of such lock), but I've
> > been told numerous times that isn't the case.
> >
> > To developers: what incentives would help get this issue well-needed
> > attention? This problem makes kernel debugging, panic analysis, and
> > other console-oriented viewing basically impossible.
>
> I was recently going to look at it. The somewhat drastic approach I was
> going to take was to add a simple serializing lock around trap_fatal() and
> a few other places that do similar block prints (e.g. mca_log()). One of
> the issues with fixing this in printf itself is that you'd probably want to
> serialize complete lines of text on a per-thread basis. You would want to
> be able to accumulate this line of text across multiple calls to printf
> (think of it as line-buffering ala stdio). However, some folks may be
> nervous about printf not printing things immediately.
>
> The other issue is that lots of code assumes it can call printf from
> anywhere and everywhere. Mostly this just means that if you add locking and
> line-buffering to printf(9) you have to be very careful to make sure it
> works in odd places. Probably a lot of this could be solved by deferring
> things like trap_fatal() until panic() has already been called (which is
> bde's preferred solution I think).

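To make the line-buffering idea quoted above a bit more concrete, here is a rough userland sketch, not kernel code and not anything from the FreeBSD tree. It assumes a per-thread buffer that accumulates text across several printf-style calls and hands only complete lines to a shared "console" under one lock. The names lb_printf, lb_flush, console and console_lock are all made up for illustration, and pthreads plus C11 thread-local storage stand in for whatever the kernel would actually use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LB_SIZE  128    /* hypothetical per-thread line buffer size */
#define OUT_SIZE 4096   /* size of the stand-in console */

/* Stand-in for the console: one shared buffer guarded by one lock. */
static pthread_mutex_t console_lock = PTHREAD_MUTEX_INITIALIZER;
static char console[OUT_SIZE];
static size_t console_len;

/* Per-thread accumulation buffer (hypothetical, not a FreeBSD structure). */
static _Thread_local char lb_buf[LB_SIZE];
static _Thread_local size_t lb_len;

/* Hand the pending bytes to the console as a single unit. */
void
lb_flush(void)
{
    if (lb_len == 0)
        return;
    pthread_mutex_lock(&console_lock);
    if (console_len + lb_len < OUT_SIZE) {   /* drop output if console is full */
        memcpy(console + console_len, lb_buf, lb_len);
        console_len += lb_len;
        console[console_len] = '\0';
    }
    pthread_mutex_unlock(&console_lock);
    lb_len = 0;
}

/* printf-like call that accumulates text and emits only complete lines. */
void
lb_printf(const char *fmt, ...)
{
    char tmp[LB_SIZE];
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(tmp, sizeof(tmp), fmt, ap);
    va_end(ap);
    if (n < 0)
        return;
    if ((size_t)n >= sizeof(tmp))
        n = (int)sizeof(tmp) - 1;            /* formatted text was truncated */

    for (int i = 0; i < n; i++) {
        assert(lb_len < LB_SIZE);
        lb_buf[lb_len++] = tmp[i];
        /* Flush on newline; a full buffer forces out a partial line. */
        if (tmp[i] == '\n' || lb_len == LB_SIZE)
            lb_flush();
    }
}
```

With this, lb_printf("cpu%d: ", 0) followed by lb_printf("trap %d\n", 12) reaches the console as one line, and nothing is visible until the newline arrives. That delay is exactly the "printf not printing things immediately" concern, and a blocking mutex like this one would not be usable from every context printf(9) is called from.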
John,

Thanks for the insights; they're greatly appreciated.

I went looking this morning to see how Linux addressed this issue (if at
all), and it has been discussed a few times in the past. The longest lkml
thread I could find that mentioned the problem dates from around 2002. It's
probably not worth reading, since work was done in 2009 to solve the issue.

http://lkml.indiana.edu/hypermail/linux/kernel/0204.1/index.html#161

Work done by Red Hat in 2009 details how they implemented a lockless
version of their kernel ring buffer (similar to our system message buffer,
but probably a lot more complex):

http://lwn.net/Articles/340400/
http://lwn.net/Articles/340443/

Supposedly having multiple writers to the ring is 100% safe; no
interspersed output. The same goes for interrupt-generated output. There
are some comments in the technical document (2nd link) that imply there's
an individual ring buffer for each CPU; possibly per-CPU kernel message
buffers would solve our issue?

-- 
| Jeremy Chadwick                                   j...@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
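
P.S. A rough illustration of what "per-CPU message buffers" could mean, written as plain userland C. This is only a sketch under stated assumptions: it is not how the Linux ring buffer from the LWN articles works, and it is not FreeBSD code. Each CPU writes whole messages into its own ring, and a reader merges the rings using a global sequence counter. The counter, along with the names ring_log, ring_dump, NCPU, NREC and RECLEN, is invented for the example, and the reader is assumed to run while the writers are quiet.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define NCPU   4    /* hypothetical CPU count */
#define NREC   8    /* records per CPU ring */
#define RECLEN 64   /* max bytes per message, including the NUL */

struct rec {
    unsigned long seq;      /* global order stamp; 0 means "empty" */
    char text[RECLEN];
};

struct cpu_ring {
    struct rec recs[NREC];
    unsigned int head;      /* next slot to write; only the owning CPU uses it */
};

static struct cpu_ring rings[NCPU];
static atomic_ulong next_seq;   /* zero-initialized; first stamp handed out is 1 */

/* Append one complete message to the given CPU's ring. */
void
ring_log(int cpu, const char *msg)
{
    struct cpu_ring *r;
    struct rec *rec;

    assert(cpu >= 0 && cpu < NCPU);
    r = &rings[cpu];
    rec = &r->recs[r->head];
    r->head = (r->head + 1) % NREC;     /* the oldest record gets overwritten */
    snprintf(rec->text, sizeof(rec->text), "%s", msg);
    rec->seq = atomic_fetch_add(&next_seq, 1) + 1;
}

/* Copy all records into out in sequence order; returns bytes written. */
size_t
ring_dump(char *out, size_t outlen)
{
    size_t used = 0;
    unsigned long last = 0;

    if (outlen == 0)
        return 0;
    out[0] = '\0';
    for (;;) {
        const struct rec *best = NULL;
        size_t n;

        /* Pick the smallest stamp that is newer than the last one emitted. */
        for (int c = 0; c < NCPU; c++) {
            for (int i = 0; i < NREC; i++) {
                const struct rec *rec = &rings[c].recs[i];

                if (rec->seq > last &&
                    (best == NULL || rec->seq < best->seq))
                    best = rec;
            }
        }
        if (best == NULL)
            break;
        n = strlen(best->text);
        if (used + n >= outlen)
            break;                      /* out of room; stop early */
        memcpy(out + used, best->text, n + 1);
        used += n;
        last = best->seq;
    }
    return used;
}
```

Because a record always holds a whole message and belongs to exactly one CPU, two CPUs logging at the same time cannot mix characters within a line; the only shared state on the write path is the counter. Whether that trade-off is workable for the real message buffer is an open question, not something this sketch settles.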