On 12/26/2012 10:17 PM, Miles Fidelman wrote:
> Does this make sense, or is it totally crazy?

Simple and stupid is usually best. A twisty maze of little layers of indirection tends to be fragile and unmaintainable.

Over here, 95% of downtime is caused by maintenance reboots (kernel/libc upgrades), and 95% of hardware failures are dying disks -- no downtime there, as they're in RAID. The other 5% is basically not worth the effort -- in terms of my time and hardware costs, it's cheaper to let it break and, if needed, pull an all-nighter picking up the pieces. (Obviously, partitioning your services so that one server doesn't take them all out helps too.)

So what is it that you're trying to protect against with your HA cluster?

Dima
_______________________________________________
Linux-HA mailing list
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
