Hi,

On Wed, Oct 06, 2010 at 11:04:34AM -0400, Shravan Mishra wrote:
> Hi guys,
>
> I'm having a weird problem with my stonith resources. They are
> constantly starting and stopping.
>
> I'm using:
>
> pacemaker=1.1.3
> corosync=1.2.8
> glue=glue_1.0-10

Hmm, which glue version is this really? Can you run hb_report -V?
You should be running at least 1.0.6.

> 2.6.29.6-0.6.smp.gcc4.1.x86_64
>
> My configuration looks like this:
>
> =======================
> node ha1.itactics.com
> node ha2.itactics.com
> primitive ha1.itactics.com-stonith stonith:external/safe/ipmi \
>     op monitor interval="20" timeout="180" \
>     params target_role="started" hostname="ha1.itactics.com" ipaddr="192.168.2.3"
> primitive ha2.itactics.com-stonith stonith:external/safe/ipmi \
>     op monitor interval="20" timeout="180" \
>     params target_role="started" hostname="ha2.itactics.com" ipaddr="192.168.2.7"
> location ha1.itactics.com-stonith-placement ha1.itactics.com-stonith \
>     rule $id="ri-ha1.itactics.com-stonith-placement-1" -inf: #uname eq ha1.itactics.com

You can shorten such location constraints:

location ha1.itactics.com-stonith-placement ha1.itactics.com-stonith -inf: ha1.itactics.com

> location ha2.itactics.com-stonith-placement ha2.itactics.com-stonith \
>     rule $id="ri-ha2.itactics.com-stonith-placement-1" -inf: #uname eq ha2.itactics.com
> property $id="cib-bootstrap-options" \
>     dc-version="1.1.2-e0d731c2b1be446b27a73327a53067bf6230fb6a" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="2" \
>     stonith-enabled="true"
> =========================
>
> An excerpt from /var/log/messages
> ==========================
> Oct 6 11:00:02 ha1 lrmd: [5994]: info:
> rsc:ha2.itactics.com-stonith:1150: monitor
> Oct 6 11:00:03 ha1 lrmd: [5994]: info: cancel_op: operation
> monitor[1150] on stonith::external/safe/ipmi::ha2.itactics.com-stonith
> for client 5997, its parameters: CRM_meta_interval=[20000]
> target_role=[started] ipaddr=[192.168.2.7] CRM_meta_timeout=[180000]
> crm_feature_set=[3.0.2] CRM_meta_name=[monitor]
> hostname=[ha2.itactics.com] cancelled

Hmm, there should be more in the logs. Where do crmd and
stonith-ng log?
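
One rough way to check, assuming syslog ends up in /var/log/messages
as in your excerpt and that the daemons tag their lines as "crmd:" and
"stonith-ng:" (both are assumptions about your setup, and the function
name cluster_daemon_lines is made up for illustration):

```shell
# Sketch: print only the syslog lines written by crmd or stonith-ng.
# Reads the files given as arguments, or stdin if none are given.
cluster_daemon_lines() {
    grep -E '(crmd|stonith-ng): ' "$@"
}
```

For example: cluster_daemon_lines /var/log/messages | tail -n 50
If that prints nothing, the daemons are probably logging somewhere
else, for instance to a logfile set in the logging section of
corosync.conf.
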

Thanks,

Dejan

> Oct 6 11:00:03 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1151:
> stop
> Oct 6 11:00:03 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1152:
> start
> Oct 6 11:00:03 ha1 lrmd: [5994]: info:
> rsc:ha2.itactics.com-stonith:1153: monitor
> Oct 6 11:00:04 ha1 lrmd: [5994]: info: cancel_op: operation
> monitor[1153] on stonith::external/safe/ipmi::ha2.itactics.com-stonith
> for client 5997, its parameters: CRM_meta_interval=[20000]
> target_role=[started] ipaddr=[192.168.2.7] CRM_meta_timeout=[180000]
> crm_feature_set=[3.0.2] CRM_meta_name=[monitor]
> hostname=[ha2.itactics.com] cancelled
> Oct 6 11:00:04 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1154:
> stop
> Oct 6 11:00:04 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1155:
> start
> Oct 6 11:00:04 ha1 lrmd: [5994]: info:
> rsc:ha2.itactics.com-stonith:1156: monitor
> Oct 6 11:00:06 ha1 lrmd: [5994]: info: cancel_op: operation
> monitor[1156] on stonith::external/safe/ipmi::ha2.itactics.com-stonith
> for client 5997, its parameters: CRM_meta_interval=[20000]
> target_role=[started] ipaddr=[192.168.2.7] CRM_meta_timeout=[180000]
> crm_feature_set=[3.0.2] CRM_meta_name=[monitor]
> hostname=[ha2.itactics.com] cancelled
> Oct 6 11:00:06 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1157:
> stop
> Oct 6 11:00:06 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1158:
> start
> Oct 6 11:00:06 ha1 lrmd: [5994]: info:
> rsc:ha2.itactics.com-stonith:1159: monitor
> Oct 6 11:00:07 ha1 lrmd: [5994]: info: cancel_op: operation
> monitor[1159] on stonith::external/safe/ipmi::ha2.itactics.com-stonith
> for client 5997, its parameters: CRM_meta_interval=[20000]
> target_role=[started] ipaddr=[192.168.2.7] CRM_meta_timeout=[180000]
> crm_feature_set=[3.0.2] CRM_meta_name=[monitor]
> hostname=[ha2.itactics.com] cancelled
> Oct 6 11:00:07 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1160:
> stop
> Oct 6 11:00:08 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1161:
> start
> Oct 6 11:00:08 ha1 lrmd: [5994]: info:
> rsc:ha2.itactics.com-stonith:1162: monitor
> Oct 6 11:00:09 ha1 lrmd: [5994]: info: cancel_op: operation
> monitor[1162] on stonith::external/safe/ipmi::ha2.itactics.com-stonith
> for client 5997, its parameters: CRM_meta_interval=[20000]
> target_role=[started] ipaddr=[192.168.2.7] CRM_meta_timeout=[180000]
> crm_feature_set=[3.0.2] CRM_meta_name=[monitor]
> hostname=[ha2.itactics.com] cancelled
> Oct 6 11:00:09 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1163:
> stop
> Oct 6 11:00:09 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1164:
> start
> Oct 6 11:00:12 ha1 lrmd: [5994]: info: stonithRA plugin: got
> metadata: <?xml version="1.0"?> <!DOCTYPE resource-agent SYSTEM
> "ra-api-1.dtd"> <resource-agent name="external/safe/ipmi">
> <version>1.0</version> <longdesc lang="en"> <!-- no value -->
> </longdesc> <shortdesc lang="en"><!-- no value
> --></shortdesc> <!-- no value --> <actions> <action
> name="start" timeout="15" /> <action name="stop" timeout="15"
> /> <action name="status" timeout="15" /> <action
> name="monitor" timeout="15" interval="15" start-delay="15" />
> <action name="meta-data" timeout="15" /> </actions> <special
> tag="heartbeat"> <version>2.0</version> </special>
> </resource-agent>
> Oct 6 11:00:12 ha1 lrmd: [5994]: info: G_SIG_dispatch: started at
> 1726679412 should have started at 1726679110
> Oct 6 11:00:12 ha1 lrmd: [5994]: info:
> rsc:ha2.itactics.com-stonith:1165: monitor
> Oct 6 11:00:13 ha1 lrmd: [5994]: info: cancel_op: operation
> monitor[1165] on stonith::external/safe/ipmi::ha2.itactics.com-stonith
> for client 5997, its parameters: CRM_meta_interval=[20000]
> target_role=[started] ipaddr=[192.168.2.7] CRM_meta_timeout=[180000]
> crm_feature_set=[3.0.2] CRM_meta_name=[monitor]
> hostname=[ha2.itactics.com] cancelled
> Oct 6 11:00:13 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1166:
> stop
> Oct 6 11:00:13 ha1 lrmd: [5994]: info: rsc:ha2.itactics.com-stonith:1167:
> start
> Oct 6 11:00:13 ha1 lrmd: [5994]: info:
> rsc:ha2.itactics.com-stonith:1168: monitor
>
> ===========================
>
> I'm on a critical path for this release.
> I would really appreciate a quick help on this.
>
> Thanks a lot.
>
> Shravan
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker