Hi,

On Wed, Nov 25, 2009 at 02:37:23PM +0100, Nikola Ciprich wrote:
> Hello everybody,
> I'm trying to solve following issue:
> I've got specific resource type (virtual machine in particular)
> which takes quite long to start/stop and those actions cause
> considerable load on hosting system. on my cluster we're
> running tens of instances of vm resources, and trying to
> shutdown pacemaker on node causes it trying to stop many of
> those resources in parallel, which causes heavy machine
> overload. Then operations start timing out and whole cluster
> goes nuts. Is it possible to set some kind of constraint so
> that not more than ie 2 parallel actions are executed in time
> for vm class resource? I can't group them using group resource,
> because some of those can have target-role set to stopped if
> they're not needed... Or how can I at least set some global
> limit on number of simultaneous actions in general? If
> possible, I'd like to limit even the monitor actions so they
> run in serial if possible...

Somebody else (Dominik, I think) had a similar issue, but I can't
recall the outcome now. At any rate, it's possible to set a global
limit on parallel actions per node in lrmd. It is included in
/etc/init.d/openais, but probably not in /etc/init.d/heartbeat.
This is how it's set:

    # lrmadmin -p max-children $LRMD_MAX_CHILDREN

The default is 4. A child of the lrmd is actually an RA process
running some action (monitor, start, etc.). It's a bit more
complicated in the init script since we have to make sure that
lrmd is ready to serve requests. This is the relevant part:

    wait_for_lrmd() {
        local maxwait=30
        local i=0
        # poll once a second until the lrmd command socket exists
        while [ $i -lt $maxwait ]; do
            test -S /var/run/heartbeat/lrm_cmd_sock >/dev/null 2>&1 &&
                break
            sleep 1
            i=$(($i+1))
        done
        if [ $i -lt $maxwait ]; then
            return 0
        else
            echo "lrmd apparently didn't start"
            return 1
        fi
    }

    set_lrmd_options() {
        if [ -n "$LRMD_MAX_CHILDREN" ]; then
            # only talk to lrmd once its socket is there
            wait_for_lrmd || return
            $LRMADMIN -p max-children $LRMD_MAX_CHILDREN
        fi
    }

I'll have that bit added to heartbeat for the next release.

Thanks,

Dejan

> Thanks a lot in advance!
> with best regards
> nik
>
> --
> -------------------------------------
> Nikola CIPRICH
> LinuxBox.cz, s.r.o.
> 28. rijna 168, 709 01 Ostrava
>
> tel.: +420 596 603 142
> fax: +420 596 621 273
> mobil: +420 777 093 799
> www.linuxbox.cz
>
> mobil servis: +420 737 238 656
> email servis: ser...@linuxbox.cz
> -------------------------------------

_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
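
As a side note on the max-children setting described above: since
the value comes from an environment variable, it may be worth
checking it before handing it to lrmadmin. The following is only a
rough sketch, not part of any shipped init script. The helper names
is_positive_int and apply_max_children are made up for illustration,
and the optional second argument (a command to run instead of
lrmadmin) is an assumption added so the function can be tried out
without a running lrmd.

```shell
#!/bin/sh
# Sketch only: validate a max-children value before passing it on.
# is_positive_int and apply_max_children are hypothetical helpers.

is_positive_int() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        0*)          return 1 ;;   # zero, or a leading zero
        *)           return 0 ;;
    esac
}

apply_max_children() {
    # $1: requested limit
    # $2: command to invoke (assumed default: lrmadmin)
    limit=$1
    cmd=${2:-lrmadmin}
    if ! is_positive_int "$limit"; then
        echo "ignoring invalid max-children value: '$limit'" >&2
        return 1
    fi
    # same invocation as shown in the mail above
    "$cmd" -p max-children "$limit"
}
```

In an init script this would presumably be called after the wait
for the lrmd socket, e.g. apply_max_children "$LRMD_MAX_CHILDREN".
For the original problem, a value of 2 would let at most two RA
actions run at once on that node.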