On Tue, Oct 29, 2013 at 12:10:26PM PDT, matthewhall spake thusly:
> May I enquire as to the nature of the filesystems on these VMs?
> It surprises me that a sudden inability to write to the block device beneath
> is causing such hassle at the FS layer, ext3 upward (as is standard under
> RH) has a pretty robust journal system.

I work in a similar environment (although with a more reliable network,
fortunately). I use Xen with an iSCSI backend. When the network fails, the
filesystem effectively disappears from the VM. It's as if you had reached
into the chassis and yanked out the SATA cables. That is going to disrupt
any machine, and it has nothing to do with ext3 or any other filesystem.

When the machine comes back up, it sometimes has fsck errors. These are
usually resolved with fsck -y, although I hesitate to make that happen
automatically. And then there are always the MySQL tables in need of
repair, services which did not come back up automatically, etc. Finding a
non-manual way to handle this when you have thousands of VMs is a very
hard problem.

I would suggest not putting all of your eggs in one datacenter basket,
either by having completely separate and independent power circuits within
a datacenter or by using a completely separate datacenter. A second
datacenter is also useful for playing the providers off each other on
price.

-- 
Tracy Reed, RHCE
Digital signature attached for your safety.
Copilotco
PCI/HIPAA/SOX Compliant Secure Hosting
866-MY-COPILOT x101
http://copilotco.com
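
P.S. As a rough illustration of the detection half of the cleanup
problem, here is a minimal Python sketch that flags filesystems which have
gone read-only. It assumes the guests mount ext3/ext4 with
errors=remount-ro, so that an I/O failure leaves the filesystem mounted
read-only; that mount option is an assumption, not something described
above. The function name read_only_mounts and the SAMPLE data are made up
for the example. On a real guest you would pass in the contents of
/proc/mounts.

```python
def read_only_mounts(mounts_text, fs_types=("ext3", "ext4")):
    """Return mount points of the given filesystem types that are read-only.

    mounts_text is text in /proc/mounts format:
    device, mount point, filesystem type, comma-separated options, dump, pass.
    """
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        # "ro" must match a whole option, not a substring such as "errors"
        if fs_type in fs_types and "ro" in options.split(","):
            flagged.append(mount_point)
    return flagged


# Hypothetical sample data, not taken from a real machine.
SAMPLE = """\
/dev/xvda1 / ext3 ro,relatime,errors=remount-ro 0 0
/dev/xvda2 /var ext3 rw,relatime 0 0
proc /proc proc rw,nosuid 0 0
"""

if __name__ == "__main__":
    print(read_only_mounts(SAMPLE))
```

This only tells you which VMs need attention. Deciding whether to run
fsck -y on them unattended is still a judgment call.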
_______________________________________________
Tech mailing list
Tech@lists.lopsa.org
https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
 http://lopsa.org/