Hi!

I have a server that runs about 30 virtual machines. Normally it handles this load very well, but a restart can be a bit dicey. I have found that by staggering the VM startups (currently done manually), the system handles the growing load much more gracefully. The sequence goes something like this:

1. The node reboots.
2. Pacemaker (and related services) is started.
3. Immediately, all VM resources are stopped (for X in `crm status --inactive | grep....`...; do crm resource stop $X...).
4. Once Pacemaker has brought the node online, all VM resources are started one at a time (for X...; crm resource start $X; sleep 45s; done).

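As a rough sketch, step 4 could be wrapped in a small shell function like the one below. The function name `staggered_start`, the `STAGGER_DELAY` variable, and the resource names in the comment are placeholders for illustration; only `crm resource start` and the 45-second pause come from the loop above. It runs outside Pacemaker, so it is a workaround and not the built-in staggering I'm asking about.

```shell
#!/bin/sh
# Sketch only: start each named resource in turn, pausing between starts.
# STAGGER_DELAY (seconds) is a placeholder; 45 matches the manual loop.
STAGGER_DELAY="${STAGGER_DELAY:-45}"

staggered_start() {
    for rsc in "$@"; do
        # A failed start is reported but does not abort the loop,
        # so one bad VM cannot block the remaining ones.
        crm resource start "$rsc" || echo "start failed: $rsc" >&2
        sleep "$STAGGER_DELAY"
    done
}

# Example call (hypothetical resource names):
#   staggered_start vm-web1 vm-db1 vm-mail1
```

The delay is read each time the function runs, so it can be overridden per invocation, e.g. `STAGGER_DELAY=90 staggered_start ...`.
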
There are two things I'd like to accomplish, but if I can only get one, that would be fine too.

First and foremost, I'd like to have Pacemaker stagger the startup of certain resources according to a time delay. Although in the example above the node is rebooted, in a two-or-more-node case a single node failure might dump a significant number of resources onto the surviving nodes and, more significantly, thereby dump a huge amount of load on the SAN that backs the VM host(s). Having the VM startups or restarts staggered automatically would help mitigate this. Staggering should be relative to other relevant resources. (Ordering takes care of delaying the VMs from starting until after the SAN stores mount, but each VM should wait a while before another VM kicks off. A failure of one VM to start should not prevent other VMs from starting.)

Second, I think it would be useful to be able to group resources together for staggered startup. For instance, most of my VMs are Linux, and they boot very quickly with little load. Some are Windows, and they load the host and SAN very badly on boot. I would ideally create small groups of Linux hosts (to be started together) and start the Windows hosts one at a time (or, another way to think of it, put them in groups of one each, so that I'm staggering the groups instead of the individual resources).

A key to making this work will be specifying the delay between starting successive VMs/groups. The vm-start command returns from libvirt almost immediately, but I want to wait a while for the virtual machine to boot, something I don't yet know how to check for easily in Pacemaker. Although putting in an arbitrary time delay does seem a little kludgey, it also appears to be very effective for my situation.

NB: the groups I describe above have no relationship to groups in the classical Pacemaker sense; they don't have to live together, nor is there necessarily a hard order of startup or shutdown described.
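To make the staggering-group idea concrete, here is a minimal sketch of it as an external script. This is not a Pacemaker feature; `start_batches`, `BATCH_DELAY`, and the resource names are made up for illustration, and the only real command used is `crm resource start`. Each argument is one batch: its members are started back to back, then the script waits before the next batch.

```shell
#!/bin/sh
# Sketch only: start resources in batches with a pause after each batch.
# BATCH_DELAY (seconds) is a placeholder value.
BATCH_DELAY="${BATCH_DELAY:-45}"

# Each argument is one batch: a space-separated list of resource names.
start_batches() {
    for batch in "$@"; do
        for rsc in $batch; do   # word splitting on spaces is intentional
            # A failure is logged but has no effect on the rest of the batch.
            crm resource start "$rsc" || echo "start failed: $rsc" >&2
        done
        sleep "$BATCH_DELAY"
    done
}

# Example call (hypothetical names): three Linux guests together,
# then each Windows guest as a batch of one.
#   start_batches "vm-lin1 vm-lin2 vm-lin3" "vm-win1" "vm-win2"
```
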
If one resource in a staggering-group fails or is stopped, it has no effect on the rest of the group. There is only the notion that those resources should be started together, and started after or before some other group of resources plus a time delay. In essence, whereas Pacemaker groups describe what to start, I am looking to describe when to start.

I don't think stop-staggering has much use here, though I suppose executing large batches of stops the same way as a staggered start would prevent the VMs from all flushing to the SAN at the same time.

Is there a way to do this with the latest Pacemaker? (Sorry this got a bit long-winded...)

Thanks!!

--
Matthew
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org