Hi, Dejan Muhamedagic wrote:
> Hi,
>
> On Thu, Dec 17, 2009 at 09:18:20AM +0100, Andrew Beekhof wrote:
>> On Wed, Dec 16, 2009 at 5:55 PM, Oscar Remírez de Ganuza Satrústegui <oscar...@unav.es> wrote:
>>> [snip]
>>> What is happening here? As it appears in the log, the timeout is
>>> supposed to be 20s (20000 ms), and the service just took 3s to shut
>>> down. Is it a problem with lrmd?
>>
>> Looks like it.
>
> Don't think so. Here are the logs again:
>
>   Dec 15 20:12:55 herculespre lrmd: [8559]: info: rsc:mysql-horde-service:38: stop
>
> lrmd invokes the RA to stop mysql. Whatever happened between this time
> and the following:
>
>   20:13:14 [Note] /usr/local/etc2/mysql-horde/libexec/mysqld: Normal shutdown
>   20:13:17 [Note] /usr/local/etc2/mysql-horde/libexec/mysqld: Shutdown
>   Dec 15 20:13:17 herculespre lrmd: [8559]: WARN: mysql-horde-service:stop process (PID 12270) timed out (try 1). Killing with signal SIGTERM (15).
>
> It could be that you were unlucky here and that the database really
> took around 20 seconds to shutdown. If it is so, then
Oh, thanks! You are right!

The command to shut down the mysql resource was sent at 20:12:55, but the mysql service did not start shutting down until 20:13:14, finishing at 20:13:17 (22 seconds, which exceeds the 20 s timeout).
How is it possible to change the timeout for start or stop operations?
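In case it is useful to others on the list, my understanding is that the operation timeouts live on the primitive's op definitions and can be raised from the crm shell. A sketch, using the resource name from the logs above and illustrative values (please check the exact syntax against your crm version):

  # crm configure edit mysql-horde-service
  # then adjust the operation lines of the primitive, e.g.:
  primitive mysql-horde-service ocf:heartbeat:mysql \
      op start interval="0" timeout="120s" \
      op stop interval="0" timeout="120s"

The stop timeout in particular should comfortably exceed the worst-case shutdown time you observed (22 s here), with a generous margin.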
> please increase your timeouts. You also mentioned somewhere that 5s is
> set for a monitor timeout, that's way too low for any kind of
> resource. There's a chapter on applications in HA environments in a
> paper I recently presented (http://tinyurl.com/yg7u4bd).

We had configured very low timeouts for the monitors too. When I tried today to change them, even the crm shell alerted and advised me:
  crm(live)# configure edit
  WARNING: mysql-horde-nfs: timeout 10s for monitor_0 is smaller than the advised 40
  WARNING: mysql-horde-service: timeout 10s for monitor_0 is smaller than the advised 15
  WARNING: pingd: timeout 10s for monitor_0 is smaller than the advised 20

I have read your paper and understand the importance of tuning the timeout values correctly, in order not to cause false positives and unavailability.
Just two last questions:

1. Is it 'normal' for a resource to be set "unmanaged" just because the stop operation timed out once?
2. Is it possible to configure the cluster to try more than once to stop a resource, as is possible for the start operation with the cluster property start-failure-is-fatal="false"?
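For what it's worth, my (possibly incomplete) understanding is that stop-failure handling is controlled by the stop operation's on-fail attribute rather than by a retry count. A sketch, again with illustrative values:

  primitive mysql-horde-service ocf:heartbeat:mysql \
      op stop interval="0" timeout="120s" on-fail="block"

With on-fail="block" the cluster leaves the resource unmanaged after a failed stop (which sounds like the behavior you saw, and is what happens by default when STONITH is not configured, since the cluster cannot safely recover elsewhere); with STONITH enabled, on-fail="fence" lets the cluster fence the node and recover the resource. Someone on the list may want to confirm whether a per-operation retry exists in your Pacemaker version.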
Thank you very much for your help! I really appreciate it!

Regards,

---
Oscar Remírez de Ganuza
Servicios Informáticos
Universidad de Navarra
Ed. de Derecho, Campus Universitario
31080 Pamplona (Navarra), Spain
tfno: +34 948 425600 Ext. 3130
http://www.unav.es/SI
_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker