----- Original Message -----
> From: "Parshvi" <parshvi...@gmail.com>
> To: pacema...@clusterlabs.org
> Sent: Thursday, April 19, 2012 6:22:01 AM
> Subject: [Pacemaker] start/stop operations fail to happen in parallel on
> resources
>
> Observations:
> max-children=30
> total no. of resources=18
>
> 1) At a default value 4 of max-children, following logs were observed
> that led to monitor op’s timeout for some resources (a total of 18
> rscs):
> a. “max_child_count (4) reached, postponing execution of operation
> monitor”
> b. “WARN: perform_ra_op: the operation operation monitor[18] on
> ocf::IPaddr2::ClusterIP for client 3754, stayed in operation list for
> 14100 ms (longer than 10000 ms)”
> c. SOLUTION: the max-children of lrmd was raised to 30.
> d. ISSUES STILL OBSERVED: while 2-3 resources are stuck in start
> operation, if a rsc is issued an explicit start command
> `crm resource start rcs1`, then the start op on this rsc is delayed
> until any one of the previous resources exit from their start
> operation.
>
This is what I would expect to happen. If an operation is in flight at the same time you make a configuration change, I don't believe the change will be looked at until the operation returns or times out.

-- Vossel

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
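
For anyone reading this thread later: the postponement reported in 1(a) and 1(b) of the quoted message can be pictured as a fixed number of execution slots shared by all resource operations. The sketch below is only a toy model of that idea, not lrmd's actual code. The function name `schedule`, the assumption that all operations are submitted at time 0 and served first-come-first-served, and the durations in the example are all made up for illustration; only the 4 and 30 slot counts come from the message. It does not model the behaviour in 1(d), where a configuration change waits for in-flight operations.

```python
import heapq

def schedule(durations_ms, max_children):
    """Toy model: run operations FIFO with at most max_children at once.

    All operations are assumed to be submitted at time 0 (an assumption
    of this sketch). Returns one (start_ms, end_ms) pair per operation,
    in submission order.
    """
    running = []  # min-heap of finish times of operations holding a slot
    slots = []
    for duration in durations_ms:
        if len(running) < max_children:
            start = 0  # a free slot exists, so the operation starts at once
        else:
            # every slot is busy: wait for the earliest operation to finish
            start = heapq.heappop(running)
        end = start + duration
        heapq.heappush(running, end)
        slots.append((start, end))
    return slots

# Hypothetical workload: four slow operations followed by a quick monitor.
ops = [14100, 14100, 14100, 14100, 100]
wait_with_4 = schedule(ops, 4)[4][0]    # monitor waits for a slot to free up
wait_with_30 = schedule(ops, 30)[4][0]  # monitor gets a slot immediately
print(wait_with_4, wait_with_30)
```

In this model the fifth operation cannot start until one of the first four finishes when only 4 slots exist, which is the kind of wait the "stayed in operation list" warning reports. Raising the slot count removes that particular wait but does nothing about operations that are themselves slow to return.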