On Thu, Jul 19, 2012 at 7:11 PM, Tom Tux <tomtu...@gmail.com> wrote:
> Hi
>
> When I reboot one of our two-node-cluster-boxes (sles11 sp1, fully
> patched, HAE installed), the node does not rejoin the cluster.
> I got the following error:
>
> corosync[5377]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
> corosync[5377]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
> corosync[5377]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
> corosync[5377]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
> corosync[5377]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
> corosync[5377]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
> corosync[5377]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
> corosync[5377]: [pcmk ] WARN: route_ais_message: Sending message to local.attrd failed: ipc delivery failed (rc=-2)
> corosync[5377]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
>
> The corosync-objctl tool shows both members as joined:
> $ corosync-objctl | grep member
> runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(88.88.88.88)
> runtime.totem.pg.mrp.srp.members.1.join_count=1
> runtime.totem.pg.mrp.srp.members.1.status=joined
> runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(99.99.99.99)
> runtime.totem.pg.mrp.srp.members.2.join_count=1
> runtime.totem.pg.mrp.srp.members.2.status=joined
>
> 'crm status' gives the following output:
> $ crm status
> Connection to cluster failed: connection failed
>
> After a manual restart (/etc/init.d/openais restart), the node rejoins
> successfully.
> Any reasons or hints why the node doesn't rejoin during the normal
> init procedure?

Sounds like the cib isn't running, perhaps? Maybe look for a clue in the
pacemaker startup logs.

> Many thanks.
> Tom
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
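
To act on the suggestion to check whether the cib is running and to read the pacemaker startup logs, something like the following could be a starting point. It is only a rough sketch: the function names are made up for illustration, the log file location is not stated anywhere in this thread (it depends on the syslog setup), and the filter assumes syslog-style tags such as `cib[1234]:`.

```shell
#!/bin/sh
# Hypothetical helpers, not part of pacemaker or corosync.

# Print log lines written by the Pacemaker child daemons that the
# corosync warnings mention (cib, attrd), plus crmd.
# Assumption: lines carry a syslog-style tag such as "cib[1234]:".
daemon_log_lines() {
    logfile=$1
    grep -E '[[:space:]](cib|attrd|crmd)\[[0-9]+\]:' "$logfile"
}

# Succeeds (exit status 0) if a process named exactly "cib" exists.
cib_running() {
    pgrep -x cib >/dev/null 2>&1
}
```

After a reboot, `cib_running || echo "cib is not running"` shows whether the daemon came up at all, and `daemon_log_lines <your syslog file> | tail -n 50` narrows the log to what the daemons themselves reported. If the filter prints nothing for the boot in question, that would suggest the daemons were never started rather than that they started and then died.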