L. V. Lammert wrote:
> At 08:31 PM 6/16/2005 +0100, Niall O'Higgins wrote:
>>Controllers don't tend to like it. Sometimes with disk failure, the
>>controller will fail too!
>
> The ASUS A7V880 runs just fine with one disk dead - infant mortality a few
> months ago.
>
> Lee

One example does not make it always so.

Some people expect RAID (of either the HW or SW kind) to keep them running through a disk failure... Some have more experience.

Designing systems that work through failures is not trivial. The way devices fail in the real world is very different from the way you expect them to fail, and rarely can you get a device to fail while you are watching everything you need to watch to fix a problem once discovered. If you do get a real-world failure which produces a problem, you try to fix it, but you will probably never know how well you fixed it, because it will never fail in exactly the same way again. If you try to manufacture defective drives (i.e., spike 'em with a powder-actuated nail gun while they are spinning), you will run up a large bill rapidly (at least for a volunteer project) (but it IS fun!).

So, yes, I'm saying there are probably bugs in how HW failures are handled in OpenBSD...and probably in most other OSs. It just isn't something you can test effectively; you can only refine it over years of (bitter) experience.

I've always told people RAID is part of a rapid-repair solution, not part of a "never goes down" solution. It *may* not go down. Maybe it probably won't go down. But don't bet your career on it. Plan for the worst case, and things will always look better than expected. And you look like a genius. :)

Nick.