On Wed, Jun 27, 2012 at 6:45 AM, Brian J. Murrell <br...@interlinx.bc.ca> wrote:
> So, I have an 18 node cluster here (so a small haystack, indeed, but
> still a haystack in which to try to find a needle) where a certain
> set of (yet unknown, figuring that out is part of this process)
> operations are pooching pacemaker. The symptom is that on one or
> more nodes I get the following kinds of errors:
>
> # cibadmin -Q
> Call cib_query failed (-41): Remote node did not respond
>
> along with similar things in the log:
>
> Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 7 failed: (rc=-41) Remote node did not respond
> Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 8 failed: (rc=-41) Remote node did not respond
> Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 9 failed: (rc=-41) Remote node did not respond
> Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 10 failed: (rc=-41) Remote node did not respond
> Jun 26 19:51:39 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 11 failed: (rc=-41) Remote node did not respond
>
> Clearly some node in the cluster has a problem, but nothing in any
> of these messages is helping me figure out which one it is.
>
> Any hints on how I figure out which node this "iu-18" node is having
> problems communicating with?
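To pin down the exact window in which the updates failed, so it can be lined up against whatever else the cluster was doing at that moment, the warnings can be tallied per host and timestamp. This is only a rough Python sketch that assumes the syslog layout of the lines quoted above; `tally_failed_updates` and `LINE_RE` are hypothetical names for illustration, not part of Pacemaker.

```python
import re
from collections import Counter

# Assumed layout, taken from the quoted log lines:
# "<Mon> <day> <HH:MM:SS> <host> crmd: ... Resource update <n> failed: (rc=<code>) ..."
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+crmd:.*"
    r"Resource update\s+(?P<call>\d+)\s+failed:\s+\(rc=(?P<rc>-?\d+)\)"
)


def tally_failed_updates(lines, rc=-41):
    """Count failed resource updates with the given rc per (host, timestamp)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        # Lines that do not fit the assumed layout are skipped silently.
        if match and int(match.group("rc")) == rc:
            counts[(match.group("host"), match.group("stamp"))] += 1
    return counts


# The five warnings from the original report.
sample = [
    "Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 7 failed: (rc=-41) Remote node did not respond",
    "Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 8 failed: (rc=-41) Remote node did not respond",
    "Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 9 failed: (rc=-41) Remote node did not respond",
    "Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 10 failed: (rc=-41) Remote node did not respond",
    "Jun 26 19:51:39 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 11 failed: (rc=-41) Remote node did not respond",
]

for (host, stamp), n in sorted(tally_failed_updates(sample).items()):
    print(f"{host} {stamp}: {n} failed update(s)")
```

Fed the logs from all 18 nodes instead of this sample, the same tally would show whether the failures cluster around one short interval or are spread out over time.
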
The DC. Possibly you didn't have one at that moment in time.
Were there (m)any membership events occurring at the time?

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org