Additional information from the charm: Without cluster_count set to NUM_UNITS, a race occurs in which the relation to the last hacluster node is not yet established, which leads to an attempt to start corosync and pacemaker with only n-1 of the n nodes.
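To make the role of cluster_count concrete, here is a minimal sketch of the kind of guard it enables. This is an illustration only, not the charm's actual code: the function name ready_to_configure and the way peers are counted are assumptions, and the unit name hacluster/1 is hypothetical.

```python
def ready_to_configure(peer_units, cluster_count):
    """Return True once this unit can see enough peers to form the full cluster.

    peer_units    -- names of the *other* units visible on the peer relation
    cluster_count -- expected total number of units, including this one
    (Both names are assumptions made for this sketch.)
    """
    # Count this unit itself plus every peer it already knows about.
    return len(peer_units) + 1 >= cluster_count


# Three-unit deployment in which the last relation has not been set yet:
# only one peer is visible, so corosync/pacemaker should not be started.
print(ready_to_configure(["hacluster/0"], 3))

# Once both peers are visible, all three nodes can go into the nodelist.
print(ready_to_configure(["hacluster/0", "hacluster/1"], 3))
```

With a check like this, configuration is deferred until every expected node is known, instead of starting the services against a partial node list.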
The last node is aware of only one relation when there should be two:

    relation-list -r hanode:0
    hacluster/0

corosync.conf looks like the following when there should be 3 nodes:

    nodelist {
        node {
            ring0_addr: 10.5.35.235
            nodeid: 1000
        }
        node {
            ring0_addr: 10.5.35.237
            nodeid: 1001
        }
    }

The services themselves (not the charm) fail: corosync logs thousands of RETRANSMIT errors, and pacemaker eventually times out waiting on corosync.

I am adding more documentation to push the setting of cluster_count and updating the amulet tests to include it.

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1654403

Title:
  Race condition in hacluster charm that leaves pacemaker down