On 03/21/2013 11:15 AM, Andreas Kurz wrote:
> On 2013-03-21 14:31, Patrick Hemmer wrote:
>> I've got a 2-node cluster where it seems last night one of the nodes
>> went offline, and I can't see any reason why.
>>
>> Attached are the logs from the 2 nodes (the relevant timeframe seems to
>> be 2013-03-21 between 06:05 and 06:10).
>> This is on ubuntu 12.04
>
> Looks like your non-redundant cluster-communication was interrupted at
> around that time for whatever reason and your cluster split-brained.
>
> Does the drbd-replication use a different network-connection? If yes,
> why not using it for a redundant ring setup ... and you should use STONITH.
>
> I also wonder why you have defined "expected_votes='1'" in your
> cluster.conf.
>
> Regards,
> Andreas

But shouldn't it have recovered? The node shows as "OFFLINE", even though
it's clearly communicating with the rest of the cluster. What is the
procedure for getting the node back online? Is there anything other than
bouncing pacemaker?

Unfortunately, no to the different network connection for DRBD. These are
two EC2 instances, so redundant connections aren't available. Since it is
EC2, though, I could set up STONITH to whack the other instance. The only
problem there would be a race condition: the EC2 API for shutting down or
rebooting an instance isn't instantaneous, so both nodes could end up
sending the signal to reboot the other node.

As for expected_votes=1, it's because it's a two-node cluster. Though I
apparently forgot to set the `two_node` attribute :-(

-Patrick

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
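
To make the fencing race described in the reply concrete, here is a minimal toy simulation in Python. It is only a sketch under stated assumptions: it calls no EC2 or Pacemaker API, the 30-second API latency and 60-second stagger are made-up numbers, and `fence_delay` and `survivors` are hypothetical helper names. It shows why two nodes that fire at the same moment can both go down, and how giving each node a different fixed delay is one possible way to leave a single survivor.

```python
API_LATENCY = 30.0  # assumed: seconds before a stop request actually takes effect


def fence_delay(node: str, nodes: list[str], step: float = 60.0) -> float:
    """Return how long `node` waits before trying to fence its peer.

    The node that sorts first waits 0 s, the next one waits `step` seconds.
    `step` has to exceed API_LATENCY for the stagger to help.
    """
    return sorted(nodes).index(node) * step


def survivors(nodes: list[str], delays: dict[str, float]) -> list[str]:
    """Simulate a two-node split in which each node tries to fence the other.

    A node goes down API_LATENCY seconds after its peer sends the request,
    and it can only send its own request if it is still up at that time.
    """
    a, b = nodes
    down_at = {a: float("inf"), b: float("inf")}
    # Handle the fence attempts in the order in which they are sent.
    for sender in sorted(nodes, key=lambda n: delays[n]):
        target = b if sender == a else a
        if delays[sender] < down_at[sender]:  # sender is still up when it fires
            down_at[target] = delays[sender] + API_LATENCY
    return [n for n in nodes if down_at[n] == float("inf")]


if __name__ == "__main__":
    nodes = ["node-a", "node-b"]
    # Both nodes fire immediately.
    print(survivors(nodes, {n: 0.0 for n in nodes}))
    # Each node waits for its own staggered delay first.
    print(survivors(nodes, {n: fence_delay(n, nodes) for n in nodes}))
```

With these assumed numbers, the simultaneous case leaves no node running, while the staggered case leaves only node-a up, because node-b is already down by the time its own delay expires.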