On 31 Jul 2014, at 4:46 pm, Cédric Dufour - Idiap Research Institute <cedric.duf...@idiap.ch> wrote:
> On 31/07/14 00:17, Andrew Beekhof wrote:
>> On 31 Jul 2014, at 2:48 am, Cédric Dufour - Idiap Research Institute
>> <cedric.duf...@idiap.ch> wrote:
>>
>>> After packaging pacemaker 1.1.12 for Debian/Wheezy (along corosync 1.4.6
>>> and libqb 0.17.0), I have successfully initialized a new cluster.
>>>
>>> Back to a very simple test cluster, the only problem I have is with
>>> fencing, which fails altogether with "route_ais_message: Sending message to
>>> local.stonith-ng failed: ipc delivery failed (rc=-2)" messages:
>>>
>>> root@bc1hs22a01:~ # tail /var/log/corosync.rsyslog
>>> Jul 30 18:41:41 bc1hs22a01 stonith_admin[5411]: notice: crm_log_args:
>>> Invoked: stonith_admin -F bc1hs22a02
>>> Jul 30 18:41:41 bc1hs22a01 stonithd[4754]: notice: handle_request: Client
>>> stonith_admin.5411.fe1388ed wants to fence (off) 'bc1hs22a02' with device
>>> '(any)'
>>> Jul 30 18:41:41 bc1hs22a01 stonithd[4754]: notice:
>>> initiate_remote_stonith_op: Initiating remote operation off for bc1hs22a02:
>>> 48b69f82-29ad-4c9a-af57-0e60ae5242e4 (0)
>>> Jul 30 18:41:41 bc1hs22a01 corosync[4686]: [pcmk ] WARN:
>>> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery
>>> failed (rc=-2)
>> rc=-2 is coming from send_client_ipc(void *conn, const AIS_Message * ais_msg)
>>
>> specifically:
>>
>>     if (conn == NULL) {
>>         rc = -2;
>>
>> So the plugin thinks that stonith-ng isn't connected.
>> More logs?
>>
>
> I have completed a full restart of the cluster in order to provide the logs
> at each step; see attached log files:
> (from node_1/DC)
> - node_1-corosync-start.log
> - node_1-pacemaker-start.log
> - node_1-corosync-node_2_join.log
> - node_1-pacemaker-node_2_join.log
> (from node_2)
> - node_2-corosync-start.log
> - node_2-pacemaker-start.log
>
> The problem manifests itself already in DC start log - because of previous
> fencing attempt - at 08:19:21 and 08:19:42:
>
> root@bc1hs22a01:~ # fgrep 'ipc delivery failed' node_1-corosync-start.log
> Jul 31 08:19:21 bc1hs22a01 corosync[31057]: [pcmk ] WARN:
> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery
> failed (rc=-2)
> Jul 31 08:19:42 bc1hs22a01 corosync[31057]: [pcmk ] WARN:
> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery
> failed (rc=-2)
>
> While it would seem (to me) that the stonith plugin successfully connected to
> the CIB:

It's not the CIB that's the issue:

>>> Jul 30 18:41:41 bc1hs22a01 corosync[4686]: [pcmk ] WARN:
>>> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery
>>> failed (rc=-2)

That's the pacemaker plugin inside corosync (which uses a completely different
IPC mechanism).

FWIW, the plugin is extremely deprecated; you're encouraged to use
pacemaker+cman or begin working towards corosync2 + pacemakerd.
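
If it helps to see when and how often the plugin reports this, here is a rough sketch that pulls the corosync pid, the IPC target and the rc out of the "ipc delivery failed" warnings quoted above. It is not part of pacemaker or corosync; the names PATTERN, find_ipc_failures and sample are made up for illustration, and it assumes each log entry sits on a single line in the real file (the wrapping in this thread comes from the mailer).

```python
import re

# Matches the plugin warning quoted in this thread, e.g.
#   ... corosync[31057]: [pcmk ] WARN: route_ais_message: Sending message
#   to local.stonith-ng failed: ipc delivery failed (rc=-2)
# Assumption: the whole entry is on one line in the actual log file.
PATTERN = re.compile(
    r"corosync\[(?P<pid>\d+)\]:.*?route_ais_message: Sending message to "
    r"(?P<target>\S+) failed: ipc delivery failed \(rc=(?P<rc>-?\d+)\)"
)


def find_ipc_failures(lines):
    """Return a (pid, target, rc) tuple for each matching warning line."""
    hits = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            hits.append(
                (int(match.group("pid")), match.group("target"), int(match.group("rc")))
            )
    return hits


# Two entries taken from the logs in this thread, unwrapped to single lines.
sample = [
    "Jul 31 08:19:21 bc1hs22a01 corosync[31057]: [pcmk ] WARN: "
    "route_ais_message: Sending message to local.stonith-ng failed: "
    "ipc delivery failed (rc=-2)",
    "Jul 31 08:19:21 [31092] bc1hs22a01 stonithd: debug: "
    "cib_native_signon_raw: Connection to CIB successful",
]

for pid, target, rc in find_ipc_failures(sample):
    print(pid, target, rc)
```

To run it against a real file, pass the open file object instead of sample, e.g. find_ipc_failures(open("node_1-corosync-start.log")). Only the first sample line matches; the stonithd CIB sign-on entry is ignored, which fits the point that the CIB connection is not where the failure is.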
>
> root@bc1hs22a01:~ # fgrep cib_native_signon_raw node_1-pacemaker-start.log
> Jul 31 08:19:20 [31096] bc1hs22a01 crmd: debug:
> cib_native_signon_raw: Connection unsuccessful (0 (nil))
> Jul 31 08:19:20 [31096] bc1hs22a01 crmd: debug:
> cib_native_signon_raw: Connection to CIB failed: Transport endpoint is
> not connected
> Jul 31 08:19:20 [31092] bc1hs22a01 stonithd: debug:
> cib_native_signon_raw: Connection unsuccessful (0 (nil))
> Jul 31 08:19:20 [31092] bc1hs22a01 stonithd: debug:
> cib_native_signon_raw: Connection to CIB failed: Transport endpoint is
> not connected
> Jul 31 08:19:21 [31096] bc1hs22a01 crmd: debug:
> cib_native_signon_raw: Connection to CIB successful
> Jul 31 08:19:21 [31092] bc1hs22a01 stonithd: debug:
> cib_native_signon_raw: Connection to CIB successful
> Jul 31 08:19:25 [31094] bc1hs22a01 attrd: debug:
> cib_native_signon_raw: Connection to CIB successful
>
> Best,
>
> Cédric
>
> <node_1-corosync-start.log><node_1-pacemaker-start.log><node_1-corosync-node_2_join.log><node_1-pacemaker-node_2_join.log><node_2-corosync-start.log><node_2-pacemaker-start.log>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org