Hi Ulrich, Yes, `crm_verify -L` is fine.
Regards,
James

On 10/26/2012 12:34 PM, Ulrich Windl wrote:
> Hi!
>
> Just one idea: "crm_verify -L" is fine?
>
> Regards,
> Ulrich
>
> >>> James Guthrie <[email protected]> wrote on 26.10.2012 at 11:14 in message
> <[email protected]>:
>> Hi Emmanuel,
>>
>> I should maybe have mentioned earlier that I'm not using either of the
>> subshells for pacemaker, I'm configuring everything via XML. Also, I
>> don't and won't have python compiled in my environment, so any crm
>> commands are a no-go.
>>
>> Regards,
>> James
>>
>>
>> On 10/26/2012 11:10 AM, Emmanuel Saint-Joanis wrote:
>>> just to see the syntax (not easy in XML), if it shows something
>>> obviously bad
>>> can U paste the : crm configure show
>>>
>>> 2012/10/26 James Guthrie <[email protected]>
>>>
>>> Hi Emmanuel,
>>>
>>> It might help for further debugging to attach my pacemaker config, so
>>> here's a pastebin of `cibadmin -Ql` as it is on the cluster right now -
>>> still in the state of one node being "offline" and the other online.
>>>
>>> http://pastebin.com/s3kr6Fxx
>>>
>>> As you can see in the config, I have stonith disabled.
>>>
>>> Regards,
>>> James
>>>
>>> On 10/26/2012 10:48 AM, Emmanuel Saint-Joanis wrote:
>>> > It seems like (CRMd/pEngine) thinks : "I didn't manage to shoot the
>>> > failing node, therefore I (kind of) blacklist it as soon as I get
>>> > control on it"
>>> > Did you test extensively that your config works with ->
>>> > stonith-enabled="false" <- first ?
>>> >
>>> >
>>> > 2012/10/26 James Guthrie <[email protected]>
>>> >
>>> > Hi Emmanuel,
>>> >
>>> > corosync is bound to the correct interface on both hosts.
>>> >
>>> > I looked for that line in the logs, but it didn't appear.
>>> >
>>> > My previous e-mail addressed to Ulrich contains logfiles and a broad
>>> > explanation of the process that those logfiles capture.
>>> >
>>> > Regards,
>>> > James
>>> >
>>> > On 10/25/2012 06:34 PM, Emmanuel Saint-Joanis wrote:
>>> > > Looks like a common timeout issue in network upcoming.
>>> > >
>>> > > See if corosync is bound to 127.0.0.1 instead of real interface with :
>>> > > corosync-cmapctl | grep member
>>> > >
>>> > > Also check if no line is appearing in /var/log/messages :
>>> > > WARN: cib_peer_callback: Discarding cib_apply_diff message (322) from server2: not in our membership
>>> > >
>>> > > Send logs to any web service as pastebin.com <http://pastebin.com>.
>>> > >
>>> > > 2012/10/25 James Guthrie <[email protected]>
>>> > >
>>> > > Hi all,
>>> > >
>>> > > I've been battling with this problem for a few hours now, I've gone over
>>> > > the obvious errors that it could have been with the guys in the linux-ha
>>> > > IRC. I'd really like some help in trying to solve this problem.
>>> > >
>>> > > I have a two node corosync/pacemaker cluster (corosync: 2.0.1 pacemaker:
>>> > > 1.1.8). I can get the cluster to work fine, but I can also very easily
>>> > > get the cluster into a state from which it seems unable to recover. All
>>> > > I have to do is reboot one of the cluster node's hosts. When doing so,
>>> > > any resources that were running on it are transferred to the second
>>> > > host. When the host comes back up though it appears as OFFLINE in the
>>> > > crm_mon of both cluster nodes.
>>> > >
>>> > > Regardless of what I do on the "offline" host, nothing gets better. If I
>>> > > however stop and restart corosync/pacemaker on the other "online" host,
>>> > > then everything seems to work again.
>>> > >
>>> > > I tried waiting a while with one node offline, after a while the online
>>> > > node went offline, stating that the other node was now offline. For a
>>> > > few minutes the output of crm_mon was different on both hosts (both
>>> > > thought the other was online, they were offline). Then finally it
>>> > > settled in the exact opposite state as previously.
>>> > >
>>> > > I've had a long look through the logs but I don't seem to be able to
>>> > > pinpoint anything particular that tells me that there is a reason for
>>> > > that host failing to be online.
>>> > >
>>> > > I'd like to attach the logs, but thought that approx 1500 lines of
>>> > > additional text in this e-mail might be a bit too much.
>>> > >
>>> > > How should I best attach the logs and config files? Which parts of the
>>> > > logs and config files would most likely reveal the problem in this case?
>>> > >
>>> > > Regards,
>>> > > James
>>> > >
>>> > > _______________________________________________
>>> > > Linux-HA mailing list
>>> > > [email protected]
>>> > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> > > See also: http://linux-ha.org/ReportingProblems
