I have an 18-node cluster here (a small haystack, admittedly, but
still a haystack in which to find a needle) where some set of
operations is breaking pacemaker.  Which operations is not yet
known; figuring that out is part of this process.  The symptom is
that on one or more nodes I get errors like the following:

# cibadmin -Q
Call cib_query failed (-41): Remote node did not respond

along with similar messages in the log:

Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 7 failed: (rc=-41) Remote node did not respond
Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 8 failed: (rc=-41) Remote node did not respond
Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 9 failed: (rc=-41) Remote node did not respond
Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 10 failed: (rc=-41) Remote node did not respond
Jun 26 19:51:39 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 11 failed: (rc=-41) Remote node did not respond

Clearly some node in the cluster has a problem, but nothing in any
of these messages is helping me figure out which one it is.
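
The one thing I can think of to narrow it down is to gather the logs
from all 18 nodes and tally which hosts report these warnings.  A
minimal sketch of that, assuming each message sits on a single line in
the real syslog (the wrapping above is just the mailer) and that the
host name is the field right after the timestamp, as in the lines
quoted above.  The function name is my own.  This would only show
which nodes are seeing the timeouts, not which peer is failing to
respond:

```python
import re
import sys
from collections import Counter

# Matches the crmd warnings quoted above, e.g.
#   Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: ... (rc=-41) Remote node did not respond
# Assumption: one message per line, host name right after the timestamp.
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+(?P<host>\S+)\s+crmd:.*\(rc=-41\)"
)


def tally_timeouts(lines):
    """Count rc=-41 crmd warnings per reporting host."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group("host")] += 1
    return counts


if __name__ == "__main__":
    # Read concatenated logs from stdin, print hosts by warning count.
    for host, count in tally_timeouts(sys.stdin).most_common():
        print(f"{host}\t{count}")
```

Feeding it the concatenated logs from every node on stdin would print
one line per host, with the hosts reporting the most warnings first.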

Any hints on how to figure out which node "iu-18" is having problems
communicating with?

Cheers,
b.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
