Andrew,

Thanks for responding. Comments inline, marked <Bob>.
________________________________
From: Andrew Beekhof <and...@beekhof.net>
To: The Pacemaker cluster resource manager <pacemaker@oss.clusterlabs.org>
Cc: Bob Schatz <bsch...@yahoo.com>
Sent: Tue, April 12, 2011 11:23:14 PM
Subject: Re: [Pacemaker] Question regarding starting of master/slave resources and ELECTIONs

On Wed, Apr 13, 2011 at 4:54 AM, Bob Schatz <bsch...@yahoo.com> wrote:
> Hi,
> I am running Pacemaker 1.0.9 with Heartbeat 3.0.3.
> I create 5 master/slave resources in /etc/ha.d/resource.d/startstop during
> post-start.

I had no idea this was possible. Why would you do this?

<Bob> We, and a couple of other companies I know of, bundle Linux-HA/Pacemaker into an appliance. In my case, when the appliance boots, it creates HA resources based on the hardware it discovers.

I assumed that once POST-START was called in the startstop script and we have a DC, the cluster is up and running. I then use "crm" commands to create the configuration, etc. I further assumed that since we have one DC in the cluster, all "crm" commands which modify the configuration would be ordered, even if the DC fails over to a different node. Is this incorrect?

> I noticed that 4 of the master/slave resources will start right away but the
> 5th master/slave resource seems to take a minute or so, and I am only running
> with one node.
> Is this expected?

Probably, if the other 4 take around a minute each to start.
There is an lrmd config variable that controls how much parallelism it
allows (but I forget the name).

<Bob> It's max-children, and I set it to 40 for this test to see if it would change the behavior (/sbin/lrmadmin -p max-children 40).

> My configuration is below and I have also attached ha-debug.
> Also, what triggers a crmd election?

Node up/down events and whenever someone replaces the cib (which the
shell used to do a lot).

<Bob> For my test, I only started one node so that I could avoid node up/down events. I believe the log shows the cib being replaced.
Since I am using crm, I assume the replacement must be due to crm. Do the crm_resource, etc. commands also replace the cib? Would using those avoid the elections caused by cib replacement?

Thanks,

Bob

> I seemed to have a lot of elections in
> the attached log. I was assuming that on a single node I would only run the
> election once in the beginning and then there would not be another one until
> a new node joined.
>
> Thanks,
> Bob
>
> My configuration is:
> node $id="856c1f72-7cd1-4906-8183-8be87eef96f2" mgraid-s000030311-1
> primitive SSJ000030312 ocf:omneon:ss \
>         params ss_resource="SSJ000030312" ssconf="/var/omneon/config/config.J000030312" \
>         op monitor interval="3s" role="Master" timeout="7s" \
>         op monitor interval="10s" role="Slave" timeout="7" \
>         op stop interval="0" timeout="20" \
>         op start interval="0" timeout="300"
> primitive SSJ000030313 ocf:omneon:ss \
>         params ss_resource="SSJ000030313" ssconf="/var/omneon/config/config.J000030313" \
>         op monitor interval="3s" role="Master" timeout="7s" \
>         op monitor interval="10s" role="Slave" timeout="7" \
>         op stop interval="0" timeout="20" \
>         op start interval="0" timeout="300"
> primitive SSJ000030314 ocf:omneon:ss \
>         params ss_resource="SSJ000030314" ssconf="/var/omneon/config/config.J000030314" \
>         op monitor interval="3s" role="Master" timeout="7s" \
>         op monitor interval="10s" role="Slave" timeout="7" \
>         op stop interval="0" timeout="20" \
>         op start interval="0" timeout="300"
> primitive SSJ000030315 ocf:omneon:ss \
>         params ss_resource="SSJ000030315" ssconf="/var/omneon/config/config.J000030315" \
>         op monitor interval="3s" role="Master" timeout="7s" \
>         op monitor interval="10s" role="Slave" timeout="7" \
>         op stop interval="0" timeout="20" \
>         op start interval="0" timeout="300"
> primitive SSS000030311 ocf:omneon:ss \
>         params ss_resource="SSS000030311" ssconf="/var/omneon/config/config.S000030311" \
>         op monitor interval="3s" role="Master" timeout="7s" \
>         op monitor interval="10s" role="Slave" timeout="7" \
>         op stop interval="0" timeout="20" \
>         op start interval="0" timeout="300"
> primitive icms lsb:S53icms \
>         op monitor interval="5s" timeout="7" \
>         op start interval="0" timeout="5"
> primitive mgraid-stonith stonith:external/mgpstonith \
>         params hostlist="mgraid-canister" \
>         op monitor interval="0" timeout="20s"
> primitive omserver lsb:S49omserver \
>         op monitor interval="5s" timeout="7" \
>         op start interval="0" timeout="5"
> ms ms-SSJ000030312 SSJ000030312 \
>         meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> ms ms-SSJ000030313 SSJ000030313 \
>         meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> ms ms-SSJ000030314 SSJ000030314 \
>         meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> ms ms-SSJ000030315 SSJ000030315 \
>         meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> ms ms-SSS000030311 SSS000030311 \
>         meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> clone Fencing mgraid-stonith
> clone cloneIcms icms
> clone cloneOmserver omserver
> location ms-SSJ000030312-master-w1 ms-SSJ000030312 \
>         rule $id="ms-SSJ000030312-master-w1-rule" $role="master" 100: #uname eq mgraid-s000030311-0
> location ms-SSJ000030313-master-w1 ms-SSJ000030313 \
>         rule $id="ms-SSJ000030313-master-w1-rule" $role="master" 100: #uname eq mgraid-s000030311-0
> location ms-SSJ000030314-master-w1 ms-SSJ000030314 \
>         rule $id="ms-SSJ000030314-master-w1-rule" $role="master" 100: #uname eq mgraid-s000030311-0
> location ms-SSJ000030315-master-w1 ms-SSJ000030315 \
>         rule $id="ms-SSJ000030315-master-w1-rule" $role="master" 100: #uname eq mgraid-s000030311-0
> location ms-SSS000030311-master-w1 ms-SSS000030311 \
>         rule $id="ms-SSS000030311-master-w1-rule" $role="master" 100: #uname eq mgraid-s000030311-0
> order orderms-SSJ000030312 0: cloneIcms ms-SSJ000030312
> order orderms-SSJ000030313 0: cloneIcms ms-SSJ000030313
> order orderms-SSJ000030314 0: cloneIcms ms-SSJ000030314
> order orderms-SSJ000030315 0: cloneIcms ms-SSJ000030315
> order orderms-SSS000030311 0: cloneIcms ms-SSS000030311
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \
>         cluster-infrastructure="Heartbeat" \
>         dc-deadtime="5s" \
>         stonith-enabled="true"
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
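P.S. In case it helps the discussion, here is the kind of targeted update I had in mind instead of a full cib replace. This is only a sketch, untested on my appliance: it assumes the Pacemaker 1.0 semantics of crm_resource/cibadmin, and the nvpair id below is my guess at the auto-generated id for the cib-bootstrap-options property.

```shell
# Full replace of the cib -- what the shell's commit historically did;
# replacing the whole cib is one of the things that can trigger an election:
#   cibadmin --replace --xml-file new-cib.xml

# Targeted update: change only one meta attribute of one resource
# (resource name taken from my configuration above):
crm_resource --resource ms-SSJ000030312 --meta \
             --set-parameter target-role --parameter-value Started

# Targeted update: modify a single nvpair in place by id
# ("cib-bootstrap-options-dc-deadtime" is an assumed id):
cibadmin --modify --xml-text \
    '<nvpair id="cib-bootstrap-options-dc-deadtime" value="10s"/>'
```

If these only patch the affected cib sections rather than replacing the whole document, I would expect them to avoid the replace-driven elections. Is that right?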
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker