On 31 Jul 2014, at 4:46 pm, Cédric Dufour - Idiap Research Institute 
<cedric.duf...@idiap.ch> wrote:

> On 31/07/14 00:17, Andrew Beekhof wrote:
>> On 31 Jul 2014, at 2:48 am, Cédric Dufour - Idiap Research Institute 
>> <cedric.duf...@idiap.ch> wrote:
>> 
>>> After packaging pacemaker 1.1.12 for Debian/Wheezy (along with corosync 
>>> 1.4.6 and libqb 0.17.0), I have successfully initialized a new cluster.
>>> 
>>> Back to a very simple test cluster, the only problem I have is with 
>>> fencing, which fails altogether with "route_ais_message: Sending message to 
>>> local.stonith-ng failed: ipc delivery failed (rc=-2)" messages:
>>> 
>>> root@bc1hs22a01:~ # tail /var/log/corosync.rsyslog
>>> Jul 30 18:41:41 bc1hs22a01 stonith_admin[5411]:   notice: crm_log_args: 
>>> Invoked: stonith_admin -F bc1hs22a02
>>> Jul 30 18:41:41 bc1hs22a01 stonithd[4754]:   notice: handle_request: Client 
>>> stonith_admin.5411.fe1388ed wants to fence (off) 'bc1hs22a02' with device 
>>> '(any)'
>>> Jul 30 18:41:41 bc1hs22a01 stonithd[4754]:   notice: 
>>> initiate_remote_stonith_op: Initiating remote operation off for bc1hs22a02: 
>>> 48b69f82-29ad-4c9a-af57-0e60ae5242e4 (0)
>>> Jul 30 18:41:41 bc1hs22a01 corosync[4686]:   [pcmk  ] WARN: 
>>> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery 
>>> failed (rc=-2)
>> rc=-2 is coming from send_client_ipc(void *conn, const AIS_Message * ais_msg)
>> 
>> specifically:
>> 
>>    if (conn == NULL) {
>>        rc = -2;
>> 
>> So the plugin thinks that stonith-ng isn't connected.
>> More logs?
>> 
> 
> I have completed a full restart of the cluster in order to provide the logs 
> at each step; see attached log files:
> (from node_1/DC)
> - node_1-corosync-start.log
> - node_1-pacemaker-start.log
> - node_1-corosync-node_2_join.log
> - node_1-pacemaker-node_2_join.log
> (from node_2)
> - node_2-corosync-start.log
> - node_2-pacemaker-start.log
> 
> The problem already manifests itself in the DC start log - because of the 
> previous fencing attempt - at 08:19:21 and 08:19:42:
> 
> root@bc1hs22a01:~ # fgrep 'ipc delivery failed' node_1-corosync-start.log
> Jul 31 08:19:21 bc1hs22a01 corosync[31057]:   [pcmk  ] WARN: 
> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery 
> failed (rc=-2)
> Jul 31 08:19:42 bc1hs22a01 corosync[31057]:   [pcmk  ] WARN: 
> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery 
> failed (rc=-2)
> 
> While it would seem (to me) that the stonith plugin successfully connected to 
> the CIB:

It's not the CIB that's the issue:

>>> Jul 30 18:41:41 bc1hs22a01 corosync[4686]:   [pcmk  ] WARN: 
>>> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery 
>>> failed (rc=-2)

That's the pacemaker plugin inside corosync (which uses a completely different 
IPC mechanism).
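
To make the rc=-2 path quoted earlier concrete, here is a minimal sketch of 
that check. It is not the actual plugin source: demo_message, 
demo_send_client_ipc() and demo_self_check() are made-up stand-ins, and the 
only behaviour taken from the real code is "conn == NULL gives rc = -2".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the plugin's message type (not AIS_Message). */
typedef struct {
    const char *dest;     /* e.g. "stonith-ng" */
    const char *payload;
} demo_message;

/*
 * Sketch of the quoted check: if there is no connection handle for the
 * destination, give up with rc=-2 before attempting any IPC at all.
 * The success path is reduced to "return 0" for illustration.
 */
static int demo_send_client_ipc(void *conn, const demo_message *msg)
{
    int rc = 0;

    (void)msg;            /* payload is irrelevant to this check */

    if (conn == NULL) {
        rc = -2;          /* destination is not connected */
    }
    return rc;
}

/* Tiny usage example: exercises both branches of the sketch. */
static void demo_self_check(void)
{
    demo_message msg = { "stonith-ng", "fence request" };
    int fake_conn = 1;    /* any non-NULL pointer counts as "connected" */

    assert(demo_send_client_ipc(NULL, &msg) == -2);
    assert(demo_send_client_ipc(&fake_conn, &msg) == 0);
}
```

In other words, the warning says nothing about the message itself; it only 
says that no connection for stonith-ng was found at the time of delivery.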

FWIW, the plugin is extremely deprecated; you're encouraged to use 
pacemaker+cman or begin working towards corosync2 + pacemakerd.

> 
> root@bc1hs22a01:~ # fgrep cib_native_signon_raw node_1-pacemaker-start.log
> Jul 31 08:19:20 [31096] bc1hs22a01       crmd:    debug: 
> cib_native_signon_raw:     Connection unsuccessful (0 (nil))
> Jul 31 08:19:20 [31096] bc1hs22a01       crmd:    debug: 
> cib_native_signon_raw:     Connection to CIB failed: Transport endpoint is 
> not connected
> Jul 31 08:19:20 [31092] bc1hs22a01   stonithd:    debug: 
> cib_native_signon_raw:     Connection unsuccessful (0 (nil))
> Jul 31 08:19:20 [31092] bc1hs22a01   stonithd:    debug: 
> cib_native_signon_raw:     Connection to CIB failed: Transport endpoint is 
> not connected
> Jul 31 08:19:21 [31096] bc1hs22a01       crmd:    debug: 
> cib_native_signon_raw:     Connection to CIB successful
> Jul 31 08:19:21 [31092] bc1hs22a01   stonithd:    debug: 
> cib_native_signon_raw:     Connection to CIB successful
> Jul 31 08:19:25 [31094] bc1hs22a01      attrd:    debug: 
> cib_native_signon_raw:     Connection to CIB successful
> 
> Best,
> 
> Cédric
> 
> <node_1-corosync-start.log><node_1-pacemaker-start.log><node_1-corosync-node_2_join.log><node_1-pacemaker-node_2_join.log><node_2-corosync-start.log><node_2-pacemaker-start.log>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
