It looks like I found the issue, but I don't know how to fix it. I looked into run_stonith_agent() in st_client.c.
It is calling fence_legacy instead of external/ipmi. Since I don't know
how to enable tracing for the different subsystems, I put my own message
in the code, and this is what I got:

========
Oct 8 12:32:25 c687 stonith-ng: [1436]: notice: run_stonith_agent:
action monitor with device fence_legacy
===========

I'm using external/ipmi. This is what my cib looks like:

=======
node ha1.itactics.com \
        attributes standby="off"
node ha2.itactics.com \
        attributes standby="off"
primitive ha1.itactics.com-stonith stonith:external/safe/ipmi \
        op monitor interval="20" timeout="180" \
        params target_role="started" hostname="ha1.itactics.com" ipaddr="192.168.2.3"
primitive ha2.itactics.com-stonith stonith:external/safe/ipmi \
        op monitor interval="20" timeout="180" \
        params target_role="started" hostname="ha2.itactics.com" ipaddr="192.168.2.7"
location ha1.itactics.com-stonith-placement ha1.itactics.com-stonith \
        rule $id="ri-ha1.itactics.com-stonith-placement-1" -inf: #uname eq ha1.itactics.com
location ha2.itactics.com-stonith-placement ha2.itactics.com-stonith \
        rule $id="ri-ha2.itactics.com-stonith-placement-1" -inf: #uname eq ha2.itactics.com
property $id="cib-bootstrap-options" \
        dc-version="1.1.2-e0d731c2b1be446b27a73327a53067bf6230fb6a" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2"
======

I'm confused about why fence_legacy is getting called.

A side question: I have built pacemaker with the flag --with-trace-date.
How do I enable tracing to see all the useful information that is
printed out?

Please help. Thanks,
Shravan

On Thu, Oct 7, 2010 at 10:04 AM, Dejan Muhamedagic <deja...@fastmail.fm> wrote:
> Hi,
>
> On Wed, Oct 06, 2010 at 01:32:06PM -0400, Shravan Mishra wrote:
>> Please find hb_report.
>
> hb_report couldn't find the logs, probably because you have both
> syslog and to-file logging. Anyway, it could be that stuff such
> as external/safe/ipmi cannot work, i.e. that you can't create
> subdirectories within external. Please try to move your agent
> one directory up.
>
> Thanks,
>
> Dejan
>
>> Thanks
>> Shravan
>>
>> On Wed, Oct 6, 2010 at 1:14 PM, Shravan Mishra <shravan.mis...@gmail.com> wrote:
>> > Hi,
>> >
>> > Please find the stonith and crmd logs attached.
>> >
>> > I have pruned stonith.logs as it contained lots of repeated messages.
>> >
>> > I'm in the process of installing the Date::Parse perl module; then I'll
>> > send the info from hb_report.
>> >
>> > Thanks for your help
>> >
>> > Shravan
>> >
>> >
>> > On Wed, Oct 6, 2010 at 11:26 AM, Dejan Muhamedagic <deja...@fastmail.fm> wrote:
>> >> Hi,
>> >>
>> >> On Wed, Oct 06, 2010 at 11:04:34AM -0400, Shravan Mishra wrote:
>> >>> Hi guys,
>> >>>
>> >>> I'm having a weird problem with my stonith resources. They are
>> >>> constantly starting and stopping.
>> >>>
>> >>> I'm using:
>> >>>
>> >>> pacemaker=1.1.3
>> >>> corosync=1.2.8
>> >>> glue=glue_1.0-10
>> >>
>> >> Hmm, which version is this really? Can you do hb_report -V? You
>> >> should be running at least 1.0.6.
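[Editor's note: Dejan's suggestion to move the agent one directory up could be sketched as below. This is a hedged example: the glue plugin directory (often /usr/lib/stonith/plugins or /usr/lib64/stonith/plugins, depending on the distribution) and the new name "safe-ipmi" are assumptions, so the sketch simulates the layout in a temporary directory rather than touching a real installation.]

```shell
# Simulate the glue stonith plugin layout in a temp dir (the real
# path is typically /usr/lib/stonith/plugins/external or the lib64
# equivalent).
PLUGDIR=$(mktemp -d)
mkdir -p "$PLUGDIR/external/safe"
printf '#!/bin/sh\n' > "$PLUGDIR/external/safe/ipmi"
chmod +x "$PLUGDIR/external/safe/ipmi"

# Move the agent one directory up, renaming it (e.g. to "safe-ipmi")
# so it does not clash with the stock external/ipmi plugin.
mv "$PLUGDIR/external/safe/ipmi" "$PLUGDIR/external/safe-ipmi"
rmdir "$PLUGDIR/external/safe"

ls "$PLUGDIR/external"
```

After such a move the primitives would reference stonith:external/safe-ipmi instead of stonith:external/safe/ipmi.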
>> >>
>> >>> 2.6.29.6-0.6.smp.gcc4.1.x86_64
>> >>>
>> >>> My configuration looks like this:
>> >>>
>> >>> =======================
>> >>> node ha1.itactics.com
>> >>> node ha2.itactics.com
>> >>> primitive ha1.itactics.com-stonith stonith:external/safe/ipmi \
>> >>>         op monitor interval="20" timeout="180" \
>> >>>         params target_role="started" hostname="ha1.itactics.com" ipaddr="192.168.2.3"
>> >>> primitive ha2.itactics.com-stonith stonith:external/safe/ipmi \
>> >>>         op monitor interval="20" timeout="180" \
>> >>>         params target_role="started" hostname="ha2.itactics.com" ipaddr="192.168.2.7"
>> >>> location ha1.itactics.com-stonith-placement ha1.itactics.com-stonith \
>> >>>         rule $id="ri-ha1.itactics.com-stonith-placement-1" -inf: #uname eq ha1.itactics.com
>> >>
>> >> You can reduce such locations:
>> >>
>> >> location ha1.itactics.com-stonith-placement ha1.itactics.com-stonith -inf: ha1.itactics.com
>> >>
>> >>> location ha2.itactics.com-stonith-placement ha2.itactics.com-stonith \
>> >>>         rule $id="ri-ha2.itactics.com-stonith-placement-1" -inf: #uname eq ha2.itactics.com
>> >>> property $id="cib-bootstrap-options" \
>> >>>         dc-version="1.1.2-e0d731c2b1be446b27a73327a53067bf6230fb6a" \
>> >>>         cluster-infrastructure="openais" \
>> >>>         expected-quorum-votes="2" \
>> >>>         stonith-enabled="true"
>> >>> =========================
>> >>>
>> >>> An excerpt from /var/log/messages:
>> >>> ==========================
>> >>> Oct 6 11:00:02 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1150: monitor
>> >>> Oct 6 11:00:03 ha1 lrmd: [5994]: info: cancel_op: operation monitor[1150] on
>> >>> stonith::external/safe/ipmi::ha2.itactics.com-stonith for client 5997, its parameters:
>> >>> CRM_meta_interval=[20000] target_role=[started] ipaddr=[192.168.2.7]
>> >>> CRM_meta_timeout=[180000] crm_feature_set=[3.0.2] CRM_meta_name=[monitor]
>> >>> hostname=[ha2.itactics.com] cancelled
>> >>
>> >> Hmm, there should be more in the logs. Where do crmd and
>> >> stonith-ng log?
>> >>
>> >> Thanks,
>> >>
>> >> Dejan
>> >>
>> >>> Oct 6 11:00:03 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1151: stop
>> >>> Oct 6 11:00:03 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1152: start
>> >>> Oct 6 11:00:03 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1153: monitor
>> >>> Oct 6 11:00:04 ha1 lrmd: [5994]: info: cancel_op: operation monitor[1153] on
>> >>> stonith::external/safe/ipmi::ha2.itactics.com-stonith for client 5997, its parameters:
>> >>> CRM_meta_interval=[20000] target_role=[started] ipaddr=[192.168.2.7]
>> >>> CRM_meta_timeout=[180000] crm_feature_set=[3.0.2] CRM_meta_name=[monitor]
>> >>> hostname=[ha2.itactics.com] cancelled
>> >>> Oct 6 11:00:04 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1154: stop
>> >>> Oct 6 11:00:04 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1155: start
>> >>> Oct 6 11:00:04 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1156: monitor
>> >>> Oct 6 11:00:06 ha1 lrmd: [5994]: info: cancel_op: operation monitor[1156] on
>> >>> stonith::external/safe/ipmi::ha2.itactics.com-stonith for client 5997, its parameters:
>> >>> CRM_meta_interval=[20000] target_role=[started] ipaddr=[192.168.2.7]
>> >>> CRM_meta_timeout=[180000] crm_feature_set=[3.0.2] CRM_meta_name=[monitor]
>> >>> hostname=[ha2.itactics.com] cancelled
>> >>> Oct 6 11:00:06 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1157: stop
>> >>> Oct 6 11:00:06 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1158: start
>> >>> Oct 6 11:00:06 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1159: monitor
>> >>> Oct 6 11:00:07 ha1 lrmd: [5994]: info: cancel_op: operation monitor[1159] on
>> >>> stonith::external/safe/ipmi::ha2.itactics.com-stonith for client 5997, its parameters:
>> >>> CRM_meta_interval=[20000] target_role=[started] ipaddr=[192.168.2.7]
>> >>> CRM_meta_timeout=[180000] crm_feature_set=[3.0.2] CRM_meta_name=[monitor]
>> >>> hostname=[ha2.itactics.com] cancelled
>> >>> Oct 6 11:00:07 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1160: stop
>> >>> Oct 6 11:00:08 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1161: start
>> >>> Oct 6 11:00:08 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1162: monitor
>> >>> Oct 6 11:00:09 ha1 lrmd: [5994]: info: cancel_op: operation monitor[1162] on
>> >>> stonith::external/safe/ipmi::ha2.itactics.com-stonith for client 5997, its parameters:
>> >>> CRM_meta_interval=[20000] target_role=[started] ipaddr=[192.168.2.7]
>> >>> CRM_meta_timeout=[180000] crm_feature_set=[3.0.2] CRM_meta_name=[monitor]
>> >>> hostname=[ha2.itactics.com] cancelled
>> >>> Oct 6 11:00:09 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1163: stop
>> >>> Oct 6 11:00:09 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1164: start
>> >>> Oct 6 11:00:12 ha1 lrmd: [5994]: info: stonithRA plugin: got metadata:
>> >>> <?xml version="1.0"?>
>> >>> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
>> >>> <resource-agent name="external/safe/ipmi">
>> >>>   <version>1.0</version>
>> >>>   <longdesc lang="en"> <!-- no value --> </longdesc>
>> >>>   <shortdesc lang="en"><!-- no value --></shortdesc>
>> >>>   <!-- no value -->
>> >>>   <actions>
>> >>>     <action name="start" timeout="15" />
>> >>>     <action name="stop" timeout="15" />
>> >>>     <action name="status" timeout="15" />
>> >>>     <action name="monitor" timeout="15" interval="15" start-delay="15" />
>> >>>     <action name="meta-data" timeout="15" />
>> >>>   </actions>
>> >>>   <special tag="heartbeat"> <version>2.0</version> </special>
>> >>> </resource-agent>
>> >>> Oct 6 11:00:12 ha1 lrmd: [5994]: info: G_SIG_dispatch: started at
>> >>> 1726679412 should have started at 1726679110
>> >>> Oct 6 11:00:12 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1165: monitor
>> >>> Oct 6 11:00:13 ha1 lrmd: [5994]: info: cancel_op: operation monitor[1165] on
>> >>> stonith::external/safe/ipmi::ha2.itactics.com-stonith for client 5997, its parameters:
>> >>> CRM_meta_interval=[20000] target_role=[started] ipaddr=[192.168.2.7]
>> >>> CRM_meta_timeout=[180000] crm_feature_set=[3.0.2] CRM_meta_name=[monitor]
>> >>> hostname=[ha2.itactics.com] cancelled
>> >>> Oct 6 11:00:13 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1166: stop
>> >>> Oct 6 11:00:13 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1167: start
>> >>> Oct 6 11:00:13 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1168: monitor
>> >>>
>> >>> ===========================
>> >>>
>> >>> I'm on a critical path for this release.
>> >>> I would really appreciate quick help on this.
>> >>>
>> >>> Thanks a lot.
>> >>>
>> >>> Shravan

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
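[Editor's note: external stonith plugins can also be exercised by hand, which helps separate agent problems from stonith-ng problems. Cluster-glue invokes an external/* plugin with the action (status, gethosts, on, off, reset, ...) as the first argument and the configured parameters exported as environment variables. The sketch below uses a stub script standing in for the real agent; the real external/ipmi typically needs further parameters (such as userid and passwd), and the plugin path shown in the comment is an assumption.]

```shell
# Stub agent demonstrating the external/* calling convention:
# parameters arrive as environment variables, the action as $1.
AGENT=$(mktemp)
cat > "$AGENT" <<'EOF'
#!/bin/sh
case "$1" in
  status)   echo "status of $hostname at $ipaddr: OK"; exit 0 ;;
  gethosts) echo "$hostname"; exit 0 ;;
  *)        exit 1 ;;
esac
EOF
chmod +x "$AGENT"

# The real agent would be driven the same way, e.g.:
#   hostname=ha1.itactics.com ipaddr=192.168.2.3 \
#       /usr/lib64/stonith/plugins/external/ipmi status
hostname=ha1.itactics.com ipaddr=192.168.2.3 "$AGENT" status
# -> status of ha1.itactics.com at 192.168.2.3: OK
```

A non-zero exit status from the agent is what makes stonith-ng (via lrmd) report the monitor as failed, so running the agent by hand like this usually shows the underlying error directly on stderr.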