On Thu, Sep 2, 2010 at 10:50 AM, Thomas Guthmann <tguthm...@iseek.com.au> wrote:
> Hi Andrew,
>
> First thanks for remembering my issue and looking into it :)
>
>> Jul 30 11:37:50 [..]
>
> Yes but... See the time line pasted below. (at 11:37, it starts to do
> something)
>
>>> 11:20AM : cluster is up and running
>>> 11:25AM : shutdown the IP
>>> 11:30AM : force a refresh with attrd_updater (because pingd=1 still)
>>>           It doesn't change anything still seen as up...
>>> 11:37AM : change a value in the CIB dampen from 120 to 121 for instance
>>>           Now db2 pingd is null but db1 is still 1. crm changes have
>>>           been done on db2 - dunno if it's linked.
>>> 11:40AM : start the IP again
>>> 12:00AM : IP is still seen as down...
>
> So if you can look earlier in the logs, you might see the problem. Around
> 11:25AM I shutdown the IP (see the timeline above) so the CIB should have
> been updated with pingd=0 for both nodes but it's not or half done. At
> 11:37, I updated a value in the config which usually force a flush of the
> CIB and fix everything, that's what you saw.

Ahhhh

> I can redo a better test so you
> can maybe see more. So far, when the gateway flips and then pacemaker goes
> "berko", my trick to fix the status is attrd_update -R (sorry can't remember
> the correct syntax on top of my memory) and then everything is fine again.
> Something doesn't update the CIB for sure but I don't know what.
>
>> Alas there is no debug running so I can't say for sure that the call
>> returned, but this makes it pretty likely:
>
> Anyway, how do I enable more debug so we can see what doesn't update the CIB

Assuming heartbeat, add the following to ha.cf:
   debug 1
For corosync, it's:
   debug: on
in corosync.conf

> ? Then I will give you a fresh hb_report :)

That would be helpful, thanks.

>
> Cheers,
> Thomas
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker