Robert Redelmeier writes:
> Daniel Phillips wrote in part:

>> One thing to keep in mind in all of this is: nobody is testing the
>> reliability of their journalling or any other kind of filesystem just by
>> running it.  To test these things you have to crash/interrupt the system
>> *a lot*.  Otherwise, you are just fooling yourself and everybody else.
>> How many crashes does it take to find that one little window of
>> vulnerability that comes up every 10,000 crashes normally but suddenly
>> starts coming up every time just because your customer uses their system
>> a different way?  You're doing the right thing by crash-testing it, now
>> instead of doing it 5 times do it 1,000 times.  Here's one of my
>> favorite tests: unzip a kernel source tree and wait until the disk light
>> goes out.  A second or so after it comes on again (kflushd) hit the
>> reset button.
>
> Good idea.  I certainly believe in stressing hardware (see .sig),
> but I'm not sure this test is rigorous enough.  The problem is 
> the reset button is only connected to the CPU and the hard disk
> will probably continue to write out sectors from it's hw buffer.
> OTOH, I don't like the idea of pulling the plug too often.  It's
> very hard on the hardware.  I'd expect a mechanical disk failure
> before 10,000 cycles.

The nice way to develop this code is with a block device that
discards all writes after a timer goes off.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/

Reply via email to