On 29/04/2013, at 2:48 PM, Mark Williams <m...@mwp.id.au> wrote:

> Hi all,
>
> My two-node cluster (qemu VMs, drbd) is now in quite a messy state.
>
> The problem started with an unresponsive qemu VM, which appeared to be
> caused by a libvirtd problem/bug.
> Others said the solution was to kill and restart libvirtd, which didn't help.
> To fix the problem, I figured putting node1 into a clean state via
> a server reboot would be the best idea, so I issued a crm standby
> command.
That doesn't initiate a reboot. It only tells pacemaker to try to stop all the resources running on node1.

>
> I now have node1 in a standby state, but the resources/VMs that were
> (and still are) running on it have a "Master Started (unmanaged)
> FAILED" state according to crm_mon.

Most likely because they refused to stop and you have no fencing.

>
> Any actions I try to perform on that node (for example, moving a
> resource to the other node) result in an "unknown exec error".
>
> I tried using crm_resource -C on a node1 "Started (unmanaged) FAILED"
> resource, which changed its state to "Master (unmanaged) FAILED" (it
> did shut down the running qemu VM).
> Trying to move that resource to node2 still fails with "unknown exec error".
>
> How do I get out of this problem?

Step 1 is to provide more information. Likely from your logs.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org