On Sun, Nov 22, 2009 at 5:53 PM, Nick Holland <n...@holland-consulting.net> wrote:
> Theodore Wynnychenko wrote:
> ...
>> Anyway, I had not considered the possibility of a controller failure.
>
> bad...
>
>> I also wondered if it was possible to remove a drive from the mirrored
>> hardware array, and see if it is recognized by a plain old SATA controller.
>> So, I did this by shutting the system down, enabling the motherboard's SATA
>> controller, and moving one of the drive cables to this standard SATA
>> controller.
>
> good...
>
>> Unfortunately, while the array comes up and is accessible, even though it is
>> degraded, I cannot access the drive now attached to the standard SATA
>> controller.  If I try to fsck it, I get an "unknown special file or file
>> system" message.
>>
>> So, it seems, with ami and this megaraid card, I will not be able to recover
>> from a controller failure by hooking a drive up to a standard SATA
>> controller.
>
> and thus, you find that one person's experience with ONE set of hardware
> cannot be universally generalized.
>
>> So, my question: how likely is a RAID controller failure (with the LSI
>> Megaraid PCI cards),
>
> Wrong question.
> The right question would be: what do you do when it does fail?
>
> Spare RAID hardware is required if you need rapid repair and diving for
> your backups after the failure of one little component is not what you
> are after.  And I do believe that is the point of RAID.
>
>> and would I be better off just chucking the Megaraid
>> card and using software raid with the drives connected via the standard SATA
>> controllers?
>
> If you have a single, non-redundant drive, you KNOW you have a significant
> exposure to failure, and you will have backups and such, or be ready to
> take a bit of downtime and data loss.
> When you implement RAID, you start telling yourself you have all kinds
> of tolerance to unhappy events, and then you start to believe it.
>
> With a spare controller, you could have rapid repair.
> Without a spare controller, the failure of the controller has EXACTLY
> the same impact as the failure of the drive in a single-drive, non-RAID
> system: complete and total loss of the data on the system, and recovery
> of your data from the backup which you wish you had been making.
>
> Here's the fun (= terrifying) part: WITH a spare controller, things can
> get at least as unpleasant as above.
>
> If you just set up your RAID system, "get it working", toss it into
> production, and hope magic will happen when something breaks, you would
> probably have been better off with a simpler configuration and no RAID.
> I'm serious.  The worst data-loss events I have seen involved RAID.
> If you don't understand your chosen RAID solution, then given enough
> time and/or enough opportunity, your downtime will be extended and your
> data will be lost.
>
> Not only should you have "similar" spare controllers, you need IDENTICAL
> spare controllers -- right down to the firmware versions.  Yes, I have
> seen firmware notes with big warnings of "unable to read arrays made by
> version X of the firmware or hardware".  Are you ready to bet that every
> option on the card you have now is compatible with every option of the
> card you hope to buy when your existing one dies?  If you believe that,
> I have a question for you: why are there so many updates to RAID
> controller firmware if it is so perfect?  What's the first bit of advice
> you always seem to hear on a new server setup?  "Update the firmware."
> Why?  Because the old one was crap.  Amazingly, the one you just
> installed is now perfect.  Yeah, right.
>
> It's a numbers game.  If you have five disks in a machine, maybe you
> have a 1 in 4 chance of a disk failing in the two-year life cycle of
> the machine.  If you put a sixth disk in the machine and RAID the bunch
> of them, you get a MUCH lower likelihood of failure due to a disk, but
> you add the possibility of failure of the RAID controller (which had no
> chance of failing on the system that didn't have one) -- maybe 1 in 20,
> or maybe 1 in 200, or 1 in 2000... whatever.  You also add the
> possibility of failure of process: one drive fails, you say "what's the
> rush?", and rather than rushing out and buying a new drive, you send
> the old one off for warranty replacement, and in the weeks you wait, a
> second drive fails.  I saw this recently: a big array with a few
> terabytes of important data blew out a disk.  The first failure was not
> having a spare disk on hand.  The second failure was that rather than
> running out and BUYING a new disk and putting it in the machine, the
> old drive was sent off for warranty replacement, and the system ran
> without redundancy until it came back.  At that point, they put a very
> firm value on the safety of their data: less than $100.  In this case,
> the disaster did not happen.  Pulling numbers out of thin air, I'd say
> failure of process might be something like 1 in 3 when you think you
> have hardware to save you from failures... more like 1 in 10 when you
> know you are living on the edge.
>
> (90% of all statistics are made up.  100% of those were... but they may
> be closer than you would like.)
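
To put very rough numbers on that trade-off: the probabilities below are
invented in the same spirit as Nick's, and the little helper is plain
arithmetic, nothing more.

    # Rough sketch of the "numbers game".  The per-component failure
    # probabilities are made up for illustration; with RAID1 a single
    # disk failure is survivable, so the controller (and the humans
    # running the replacement process) become the new exposure.

    def p_any_fail(probs):
        """Probability that at least one of the listed components fails."""
        survive_all = 1.0
        for p in probs:
            survive_all *= 1.0 - p
        return 1.0 - survive_all

    p_disk = 0.056       # assumed per-disk chance of dying in two years
    p_controller = 0.05  # assumed controller failure chance ("maybe 1 in 20")

    print("any of 5 plain disks failing:  %.2f" % p_any_fail([p_disk] * 5))
    print("either of 2 mirrored disks:    %.2f" % p_any_fail([p_disk] * 2))
    print("the RAID controller itself:    %.2f" % p_any_fail([p_controller]))

The only point of the arithmetic is that RAID moves the risk around rather
than removing it: the chance of losing data to a dead disk drops, while the
controller and the replacement process become the new things to worry about.
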
> If that spare controller costs you $400, but loss of data or extended
> downtime is worth $4000 to you, ok, maybe it is worth the gamble of
> living without a spare.  If your data is worth, say, $40k, maybe you
> just better spend the money for a spare controller, regardless of what
> you think the likelihood of failure is.
>
> But now that you have your RAID system in place, you had better
> experiment with it to make sure you know how to handle drive failure
> and replacement, and in the case of hardware, how to handle HARDWARE
> failure and replacement (i.e., move the disk pack to a new controller).
> Hint: it is often not as easy as you think to move the disk pack to
> new hardware.  It seems hardware vendors often don't consider that
> anything other than disks fail.  *sigh*
>
>> I figure there must be some performance loss, but I can't
>> imagine I will ever notice it looking at old pictures.
>
> Here's an alternate potential solution: put two or more simple disks
> in the computer, and periodically copy the data from the primary to
> the backup...  Not applicable in all cases, but actually superior in
> some.  It gives you a nice opportunity to think about and evaluate
> changes you made before committing them to all media.  (DNS servers
> and firewalls are two cases where I have used this effectively: the
> data isn't all that volatile, if you lose a day's changes, not the
> end of the world, but if you don't do it right, it is really nice to
> just copy the old file back quickly rather than go digging through
> tape).
>
> Note that this system has its own potential failure modes -- if your
> primary disk has lost data that the system hasn't noticed yet, it WILL
> notice during the copy operation, and probably destroy the "mirror"
> data in the process of noticing... so your backups are still important.
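
A minimal sketch of that periodic-copy idea, in Python for illustration.
The source directories, the /backup mount point and the dated-copy layout
are assumptions for the example, not anything from Nick's setup; a real
job would run from cron and deal with open files, ownership and logging.

    #!/usr/bin/env python3
    # Sketch: copy a few directory trees from the primary disk to a
    # second, separately mounted disk, keeping one copy per day.
    import shutil
    from datetime import date
    from pathlib import Path

    SOURCES = ["/etc", "/var/nsd"]   # hypothetical: config and zone data
    BACKUP_ROOT = Path("/backup")    # hypothetical mount point of 2nd disk

    def mirror(src: str) -> None:
        dest = BACKUP_ROOT / date.today().isoformat() / src.lstrip("/")
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copytree(src, dest, dirs_exist_ok=True)
        print("copied %s -> %s" % (src, dest))

    if __name__ == "__main__":
        for s in SOURCES:
            mirror(s)

Keeping dated copies instead of overwriting a single mirror is what gives
you the chance to look at a change before it has propagated to every copy
of the data, which is the property Nick is pointing at above.
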
>> Finally, I am wondering.  I had assumed that the hardware controller really
>> didn't do that much when in RAID1, and just passed the writes/reads to/from
>> both of the disks, resulting in 2 (basically) normal drives.  Obviously, I
>> was wrong.  I am wondering why/how the raid controller needs to modify the
>> disk's file system when it's only mirroring 2 drives?  (I really could not
>> find anything by google-ing around on this.)
>
> Here's where I say "you need to think about and simulate all imaginable
> failure modes", and you start to understand how these things work.
>
> The hardware RAID systems I've worked with treat RAID1 as one of several
> different RAID systems supported... they don't treat it significantly
> differently than RAID5, RAID1+0, etc.
>
> There is more to RAID1 than duping the data to two drives and keeping them
> the same.  All HW RAID systems that I have seen use some kind of signature
> to mark the drives as part of a RAID set (RAID1 or otherwise).  The
> signature is quite important:
>
> * Let's say you have six drives, three RAID1 pairs.  While servicing the
> computer, you unplug the drive cables to extract or install some other
> card in the machine... you then realize you didn't make note of what cable
> went where.  How do you want your RAID controller to handle this?  You
> probably really hope it spots the pairs and re-connects them appropriately.
>
> * What if you need to swap out a drive while the system is off?  Which
> drive does it use on power-up?
>
> * What if you replace a drive while the system is off with another drive
> that was recycled from a system with a compatible RAID card?  Which is the
> one that should be used and which should be ignored?
>
> * What if you power up a system and it has two drives attached that it
> knows nothing about?  Pick one and blindly copy it to the other?  Assume
> they are in sync?
>
> * What if you can't replace the failed drive with one that is identical
> to it?  What if the new drive is bigger, or a few blocks smaller?
>
> The signature that helps resolve the above has to be somewhere on the
> disk.  Some systems try to hide it some place the OS would never notice
> (I believe I read tech notes on one system that stuck it on the very
> last sector of the disk, on the assumption that very few OSs ever put
> anything there; I've seen one other RAID system that seemed to do the
> same, as the drives COULD be removed from the RAID system and used
> directly on a standard controller), but others just plop it at the
> front of the physical disk and create the array in the remaining space.
>
> The point of RAID1 isn't to dupe the data on two drives; the point of
> RAID1 is to have the system rapidly recoverable when something goes
> horribly wrong.  Duping the data between two drives is the way to meet
> that end, but there is more to it than two blind copies of the same
> data.
>
>> I hope I don't sound too clueless for asking.
>
> No more clueless than 90% of the people out there setting up RAID
> disasters in waiting...  Many very smart people seem to think that
> Magic Happens (or think they will have a job elsewhere) when things
> go wrong.
>
> Nick.
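
To make the "signature" idea concrete, here is a toy illustration in Python
of peeking at the last sector of a disk image for a metadata block.  The
magic string and field layout are invented for this example; every real
controller has its own (usually undocumented) format, which is exactly why
a mirror member may or may not be readable on a plain SATA controller.

    # Toy illustration of a RAID signature parked in the last sector.
    # MAGIC and the field layout are invented; this matches no real card.
    import struct
    import sys

    SECTOR = 512
    MAGIC = b"FAKERAID1"      # invented magic value

    def read_last_sector(path):
        with open(path, "rb") as f:
            f.seek(0, 2)                  # jump to end of device/image
            size = f.tell()
            f.seek(size - SECTOR)
            return f.read(SECTOR)

    def describe(sector):
        if not sector.startswith(MAGIC):
            return "no signature found: looks like a plain disk"
        # invented layout: 4-byte array id, 2-byte member index, 2-byte count
        array_id, member, members = struct.unpack_from("<IHH", sector, len(MAGIC))
        return "array %#x, member %d of %d" % (array_id, member + 1, members)

    if __name__ == "__main__":
        print(describe(read_last_sector(sys.argv[1])))

Whether that metadata is tucked into the otherwise-unused last sector or
planted at the front of the disk (pushing the filesystem down) is what
decides whether a mirror member still looks like an ordinary drive to fsck
on a plain controller.
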
You hit on a lot of things that I didn't even consider, Nick.  I think
very few people think about the controller when they design a RAID
array.  Even fewer, as you said, would consider how that controller
behaves in the event of a drive failure.

Every approach has its caveats.  You could chuck two identical RAID
cards into a box, put two striped disks on each, and mirror the two
stripes in software - but you'd also need to understand how the system
copes with a drive failing on either card, and realise that in the
improbable situation where a drive fails on both cards, you're toast
unless you implemented a bulletproof backup plan when you put the damn
thing into production.  You'd also need to know how to get the stripe
on a card going again and then rebuild the mirror - and understand how
that works too, so you don't trash your mirror during the rebuild
because it didn't go the way you thought it would.

"Magic blue smoke" is for the computer illiterate.  As the administrator
of the system you're installing, you need a comprehensive understanding
of its inner workings - and as we now realise, that includes the
underlying hardware and the firmware that drives it.  The only way I can
see to get there is to test every scenario you think you'll encounter
and watch the results, including what happens when you (need to)
intervene.  Write up a testing plan with what you expect to happen for
each type of failure and how you'll simulate it.  When you run each
test, write down what really happened, especially if it differs in any
way from your expectations.  That's the only way I can see you truly
understanding your beast.

Once you've done that, if the machine was intended for production use,
I'd buy the same hardware again and build a new box, using the current
one as a failover or development machine - it has been stressed during
the testing and can't be considered reliable for production.

I've learned a lot from your post (and I'm sure many others have).
Thank you, Nick.  If I could, I'd shout you a beer (or whatever your
poison is).

Feel free to correct me if you think I am wrong.

--
Aaron Mason - Programmer, open source addict
I've taken my software vows - for beta or for worse