----- Original Message -----
> Hi!
>
> I guess it would be better to start a separate thread on this.
>
> I have a VM with pacemaker-remote installed.
>
> Stack: cman
> Current DC: wings1 - partition with quorum
> Version: 1.1.10-14.el6-368c726
> 3 Nodes configured
> 2 Resources configured
>
>
> Online: [ oracle-test:vm-oracle-test wings1 wings2 ]

The remote-node in this case is named 'oracle-test'. The remote-node's container resource is 'vm-oracle-test'. Internally, pacemaker creates a connection resource named after the remote-node. That resource represents the pacemaker_remote connection. Kind of confusing, I know.

Here's the point: the connection resource 'oracle-test' is what is timing out here, not the VM itself. That is why the log shows oracle-test_start_0 timing out at 60000ms even though the start operation on vm-oracle-test is set to 300s. By default the connection resource has a 60 second timeout. If you want to increase that timeout, use the remote-connect-timeout resource metadata option on the VM resource.

You don't have to fully understand how all this works; just know that the remote-connect-timeout option needs to be greater than the time it takes for the virtual machine to fully initialize.

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#s-resource-options

Hope that helps!
-- Vossel

>
> vm-oracle-test (ocf::heartbeat:VirtualDomain): Started wings2
>
> 2 resources configured...
>
> However,
>
> # pcs resource show
> vm-oracle-test (ocf::heartbeat:VirtualDomain): Started
>
> As I understand, pacemaker considered pacemaker-remote on the VM as some sort
> of 'virtual resource' (called 'oracle-test' in my case), since I have only
> one 'primitive' section (VirtualDomain) in my CIB.
>
> Well, the problem is here:
>
> Sep 15 12:28:13 wings1 crmd[13553]: error: process_lrm_event: LRM operation oracle-test_start_0 (8397) Timed Out (timeout=60000ms)
> Sep 15 12:28:13 wings1 crmd[13553]: warning: status_from_rc: Action 7 (oracle-test_start_0) on wings1 failed (target: 0 vs. rc: 1): Error
> Sep 15 12:28:13 wings1 crmd[13553]: warning: update_failcount: Updating failcount for oracle-test on wings1 after failed start: rc=1 (update=INFINITY, time=1410769693)
>
> Timeout is 60 seconds!
> Even though I have:
>
> <primitive class="ocf" id="vm-oracle-test" provider="heartbeat" type="VirtualDomain">
> <instance_attributes id="vm-oracle-test-instance_attributes">
> <nvpair id="vm-oracle-test-instance_attributes-hypervisor" name="hypervisor" value="qemu:///system"/>
> <nvpair id="vm-oracle-test-instance_attributes-config" name="config" value="/etc/libvirt/qemu/oracle-test.xml"/>
> </instance_attributes>
> <operations>
> <op id="vm-oracle-test-monitor-interval-60s" interval="60s" name="monitor"/>
> <op id="vm-oracle-test-start-timeout-300s-interval-0s-on-fail-restart" interval="0s" name="start" on-fail="restart" timeout="300s"/>
> <op id="vm-oracle-test-stop-timeout-60s-interval-0s-on-fail-block" interval="0s" name="stop" on-fail="block" timeout="60s"/>
> </operations>
>
> Moreover, VirtualDomain RA has this:
>
> <actions>
> <action name="start" timeout="90" />
> <action name="stop" timeout="90" />
> <action name="status" depth="0" timeout="30" interval="10" />
> <action name="monitor" depth="0" timeout="30" interval="10" />
> <action name="migrate_from" timeout="60" />
> <action name="migrate_to" timeout="120" />
>
>
> My VM is unable to start in 60 seconds. What could be done here?
>
> --
> Best regards,
> Alexandr A. Alexandrov
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
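
For reference, here is a rough sketch of how the suggestion in the reply might look in the CIB: a meta_attributes block on the VM primitive that carries remote-connect-timeout next to the existing remote-node setting. The nvpair ids and the 300s value are assumptions for illustration (300s simply mirrors the start timeout quoted above); pick a value longer than the time the guest actually needs to boot and start pacemaker_remote.

```xml
<!-- Sketch only: meta attributes for the VM resource discussed in this thread.
     The meta_attributes/nvpair ids and the 300s value are illustrative assumptions. -->
<primitive class="ocf" id="vm-oracle-test" provider="heartbeat" type="VirtualDomain">
  <meta_attributes id="vm-oracle-test-meta_attributes">
    <!-- remote-node names the guest node; 'oracle-test' as shown in the status output -->
    <nvpair id="vm-oracle-test-meta_attributes-remote-node"
            name="remote-node" value="oracle-test"/>
    <!-- raise the connection timeout above the VM's boot time (the default is 60s) -->
    <nvpair id="vm-oracle-test-meta_attributes-remote-connect-timeout"
            name="remote-connect-timeout" value="300s"/>
  </meta_attributes>
  <!-- instance_attributes and operations stay as in the original post -->
</primitive>
```

The timeout is set on vm-oracle-test rather than on 'oracle-test' because the connection resource is created implicitly from the VM's meta attributes and has no primitive of its own in the CIB.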