On Tue, Apr 24 2007, Roland Kuhn wrote: > Hi Jens! > > On 24 Apr 2007, at 14:32, Jens Axboe wrote: > > >On Tue, Apr 24 2007, Roland Kuhn wrote: > >>Hi Jens! > >> > >>[I made a typo in the Cc: list so that lkml is only included as of > >>now. Actually I copied the typo from you ;-) ] > > > >Well no, you started the typo, I merely propagated it and forgot to > >fix > >it up :-) > > > Actually, I copied it from your printk() ;-) (thinking helps...)
Ahhh! Yeah that one indeed has a typo, tsk tsk. > >>>>>Sure. You might want to include NFS file access into your tests, > >>>>>since we've not triggered this with locally accessing the disks. > >>>>>BTW: > >>>> > >>>>How are you exporting the directory (what exports options) - how > >>>>is it > >>>>mounted by the client(s)? What chunksize is your raid6 using? > >>> > >>>And what are the nature of the files on the raid (huge, small, ?) > >>>and > >>>what are the client(s) doing? Just approximately, I know these > >>>things > >>>can be hard/difficult/impossible to specify. > >>> > >>The files are 100-400MB in size and the client is merging them into a > >>new file in the same directory using the ROOT library, which does in > >>essence alternating sequences of > >> > >>_llseek(somewhere) > >>read(n bytes) > >>_llseek(somewhere+n) > >>read(m bytes) > >>... > >> > >>and then > >> > >>_llseek(somewhere) > >>rt_sigaction(ignore INT) > >>write(n bytes) > >>rt_sigaction(INT->DFL) > >>time() > >>_llseek(somewhere+n) > >>... > >> > >>where n is of the the order of 30kB. The input files are treated > >>sequentially, not randomly. > > > >Ok, I'll see if I can reproduce it. No luck so far, I'm afraid. > > > Too bad. > > >>BTW: the machine just stopped dead, no sign whatsoever on console or > >>netconsole, so I rebooted with elevator=deadline > >>(need to get some work done besides ;-) ) > > > >Unfortunately expected, if we can race and lose an update to - > >>next_rq, > >we can race and corrupt some of the internal data structures as > >well. If > >you have the time and inclination, it would be interesting to see > >if you > >can reproduce with some debugging options enabled: > > > >- Enable all preempt, spinlock and lockdep debugging measures > >- Possibly slab poisoning, although that may slow you down somewhat > > > Kernel compilation under way, will report back. Thanks! > >Are you using 4kb stacks? > > > No idea, 'grep -i stack .config' gives no indication, but ISTR that > 4k was made the default some time back? You are on x86-64, my mistake. -- Jens Axboe - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/