On Thu, 2010-06-24 at 11:17 +0100, Ian Jackson wrote:
> Ben Hutchings writes ("Re: Bug#584881: Lockups under heavy disk IO; md (RAID)
> resync/check implicated"):
[...]
> > > Search the web suggests that symptoms very similar to mine are not
> > > uncommon, including instances without soft lockup messages, and none
> > > of the other users seem to have a similar disk layout.
> > >
> > > I can't easily test this theory but I think the unusual disk layout is
> > > probably simply making a race easier to trigger.
> >
> > Thinking of some kind of lock-dependency bug?  Blocking on a mutex for a
> > long period should still trigger a soft-lockup message.  Since there are
> > no messages from the kernel it's something of a mystery what's going on.
>
> The RAID system (md driver) has a separate mechanism for blocking
> writes, which it calls a "barrier".  I think it is quite capable of
> indefinitely blocking all writes to a device without necessarily
> triggering the soft lockup detector.
[...]

I/O barriers are block I/O operations (not specific to md) that inhibit
reordering of read and write operations.  They certainly should not be
blocking operations.  Also, device-mapper did not support barriers until
after 2.6.26 so md will not be using them in the configuration you are
using.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.