Hi

2010/8/2 Dejan Muhamedagic <deja...@fastmail.fm>:
> Hi,
>
> On Mon, Jul 19, 2010 at 07:09:11PM -0300, Diego Woitasen wrote:
>> 2010/7/16 Diego Woitasen <di...@woitasen.com.ar>:
>> > Hi,
>> > I've installed Heartbeat+Pacemaker (3.0.3 and 1.0.9). I have a
>> > resource which executes a script to check the service:
>> >
>> > primitive kolab_imapd ocf:heartbeat:kolab-service \
>> > params service="all" monitor_script="/usr/local/bin/check-imap.py" \
>> > meta migration-threshold="3" failure-timeout="300s" is-managed="true" \
>> > operations $id="operations-imap" \
>> > op monitor interval="20s" timeout="30s" on-fail="restart" \
>> > op start interval="0" timeout="120" \
>> > op stop interval="0" timeout="120"
>> >
>> > I did I/O stress using bonnie++ and I started to see this message:
>> >
>> > Jul 16 18:24:38 imapserver lrmd: [4719]: WARN: perform_ra_op: the
>> > operation operation monitor[21] on ocf::kolab-service::kolab_imapd for
>> > client 4722, its parameters: CRM_meta_interval=[20000]
>> > monitor_script=[/usr/local/bin/check-imap.py]
>> > CRM_meta_on_fail=[restart] CRM_meta_timeout=[30000]
>> > crm_feature_set=[3.0.1] CRM_meta_name=[monitor] service=[all] stayed
>> > in operation list for 32740 ms (longer than 10000 ms)
>> >
>> > The problem is that I get these messages under high I/O even without
>> > the stress testing, for example when running backups. If I understand
>> > that message correctly, the monitor operation didn't start; it was
>> > waiting on some workqueue to start.
>
> It was most probably waiting for the previous monitor operation
> to finish, though that one should have timed out according to
> your configuration. Or there were at least 4 operations on
> different resources running on the node. If you expect high load
> on the server, you should tune timeouts accordingly.
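
To have concrete numbers when tuning, here is a rough sketch in Python that pulls the queue delay out of an lrmd warning like the one above and puts it next to the configured monitor interval and timeout. Only the log format comes from the message quoted above; the function name, the regular expressions and the dictionary keys are my own and purely illustrative.

```python
import re

# Assumption: the wording matches the lrmd warning quoted in this thread.
QUEUED_RE = re.compile(
    r"stayed\s+in\s+operation\s+list\s+for\s+(\d+)\s+ms"
    r"\s+\(longer\s+than\s+(\d+)\s+ms\)"
)
META_RE = re.compile(r"CRM_meta_(interval|timeout)=\[(\d+)\]")

# The warning from the original report, wrapped the way it was in the mail.
SAMPLE = """Jul 16 18:24:38 imapserver lrmd: [4719]: WARN: perform_ra_op: the
operation operation monitor[21] on ocf::kolab-service::kolab_imapd for
client 4722, its parameters: CRM_meta_interval=[20000]
monitor_script=[/usr/local/bin/check-imap.py]
CRM_meta_on_fail=[restart] CRM_meta_timeout=[30000]
crm_feature_set=[3.0.1] CRM_meta_name=[monitor] service=[all] stayed
in operation list for 32740 ms (longer than 10000 ms)"""


def parse_lrmd_warning(text):
    """Return queue delay, warning threshold, interval and timeout in ms.

    Returns None if the text does not contain the 'stayed in operation
    list' warning.
    """
    flat = " ".join(text.split())  # undo line wrapping
    queued = QUEUED_RE.search(flat)
    if queued is None:
        return None
    info = {
        "queued_ms": int(queued.group(1)),
        "threshold_ms": int(queued.group(2)),
    }
    for name, value in META_RE.findall(flat):
        info[name + "_ms"] = int(value)
    return info


info = parse_lrmd_warning(SAMPLE)
print("queued:", info["queued_ms"], "ms")
print("interval:", info["interval_ms"], "ms")
print("timeout:", info["timeout_ms"], "ms")
print("queued longer than timeout:", info["queued_ms"] > info["timeout_ms"])
```

In the warning above the operation waited 32740 ms in the list, which is longer than both the 20000 ms interval and the 30000 ms timeout.
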
And what are the correct values for timeout and interval? Should timeout
be less than interval?

>
> Thanks,
>
> Dejan
>
>> > If I try to execute a command while I'm running the stress test it's
>> > slow (about 3 seconds) but it works. For example, I can run "crm
>> > configure show" and the output appears in 3 or 4 seconds.
>> >
>> > The server has 2 quad-core processors, 6 GB of RAM, running RHEL 5.
>> >
>> > Regards,
>> > Diego
>> >
>> > --
>> > Diego Woitasen
>> >
>>
>>
>> I've raised the priority of the process to 10 and it works now.
>>
>> The documentation says that the default rtprio is 5. That's wrong, it's 1.
>> At least in my pkgs...
>>
>> Regards,
>> Diego
>>
>> --
>> Diego Woitasen
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs:
>> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>

--
Diego Woitasen