Hi,

On Wed, Aug 11, 2010 at 05:17:03PM -0300, Diego Woitasen wrote:
> Hi
>
> 2010/8/2 Dejan Muhamedagic <deja...@fastmail.fm>:
> > Hi,
> >
> > On Mon, Jul 19, 2010 at 07:09:11PM -0300, Diego Woitasen wrote:
> >> 2010/7/16 Diego Woitasen <di...@woitasen.com.ar>:
> >> > Hi,
> >> >  I've installed Heartbeat+Pacemaker (3.0.3 and 1.0.9). I have a
> >> > resource which executes a script to check the service:
> >> >
> >> > primitive kolab_imapd ocf:heartbeat:kolab-service \
> >> >        params service="all" monitor_script="/usr/local/bin/check-imap.py" \
> >> >        meta migration-threshold="3" failure-timeout="300s" is-managed="true" \
> >> >        operations $id="operations-imap" \
> >> >        op monitor interval="20s" timeout="30s" on-fail="restart" \
> >> >        op start interval="0" timeout="120" \
> >> >        op stop interval="0" timeout="120"
> >> >
> >> > I did I/O stress using bonnie++ and I started to see this message:
> >> >
> >> > Jul 16 18:24:38 imapserver lrmd: [4719]: WARN: perform_ra_op: the
> >> > operation operation monitor[21] on ocf::kolab-service::kolab_imapd for
> >> > client 4722, its parameters: CRM_meta_interval=[20000]
> >> > monitor_script=[/usr/local/bin/check-imap.py]
> >> > CRM_meta_on_fail=[restart] CRM_meta_timeout=[30000]
> >> > crm_feature_set=[3.0.1] CRM_meta_name=[monitor] service=[all] stayed
> >> > in operation list for 32740 ms (longer than 10000 ms)
> >> >
> >> > The problem is that I've got these messages under high I/O without the
> >> > stress testing, for example running backups. If I understand that
> >> > message correctly the monitor operation didn't start, it was waiting
> >> > on some workqueue to start.
> >
> > It was most probably waiting for the previous monitor operation
> > to finish, though that one should have timed out according to
> > your configuration. Or there were at least 4 operations on
> > different resources running on the node. If you expect high load
> > on the server, you should tune timeouts accordingly.
>
> And what are the correct values for timeout and interval?

Depends on your resources. And the possible load. Perhaps bonnie
is not the right tool to stress the hosts, i.e. I doubt that
you'll run into such a high disk load for such a long period of
time. I don't know what kolab_imapd is or how heavy/deep the
monitor operation is.

> timeout < interval?

The two are independent. The interval countdown starts when the
previous monitor has finished.

Thanks,

Dejan

> >
> > Thanks,
> >
> > Dejan
> >
> >> > If I try to execute a command while I'm running the stress it's slow
> >> > (3 seconds approx.) but it works. For example, I can run "crm configure
> >> > show" and the output appears in 3 or 4 seconds.
> >> >
> >> > The server has 2 quad-core processors, 6 GB of RAM, running RHEL 5.
> >> >
> >> > Regards,
> >> > Diego
> >> >
> >> > --
> >> > Diego Woitasen
> >> >
> >>
> >>
> >> I've raised the priority of the process to 10 and it works now.
> >>
> >> The documentation says that the default rtprio is 5. That's wrong, it's 1.
> >> At least in my pkgs...
> >>
> >> Regards,
> >> Diego
> >>
> >> --
> >> Diego Woitasen
>
> --
> Diego Woitasen

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
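
To illustrate the remark in the reply that timeout and interval are
independent, here is a minimal sketch of the timing rule as stated there:
the interval countdown begins only once the previous monitor has finished.
It is not Pacemaker code. The function name, the run lengths, and the
assumption that a run is cut off exactly at its timeout are all
hypothetical; only the 20s interval and 30s timeout come from the
configuration quoted in the thread.

```python
def monitor_start_times(interval_s, timeout_s, run_lengths_s):
    """Return when each monitor run starts, in seconds after the first start.

    Assumptions (illustrative only, not taken from Pacemaker's source):
    - the next run starts interval_s after the previous run ended;
    - a run that would exceed timeout_s is ended at timeout_s.
    """
    starts = []
    now = 0.0
    for run_length in run_lengths_s:
        starts.append(now)
        # The timeout only caps how long a single run may take.
        effective = min(run_length, timeout_s)
        # The interval only spaces one run's end from the next run's start.
        now += effective + interval_s
    return starts


# Hypothetical run lengths: a quick check, one stuck under I/O load, a quick check.
print(monitor_start_times(20, 30, [1, 45, 1]))  # [0.0, 21.0, 71.0]
```

Under these assumptions the slow run pushes the third start back to 71s
rather than 42s, which is why "timeout < interval" is not a required
relation: each value controls a different part of the cycle.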