Hi,

On Thu, Dec 17, 2009 at 09:18:20AM +0100, Andrew Beekhof wrote:
> On Wed, Dec 16, 2009 at 5:55 PM, Oscar Remírez de Ganuza Satrústegui
> <oscar...@unav.es> wrote:
> > [snip]
> >
> > 2. The CRM decided to stop the service.
> > Dec 15 20:12:55 herculespre crmd: [8562]: info: do_lrm_rsc_op: Performing
> > key=4:1379:0:ae99a943-f4b7-4979-b0c9-09c7f9dd0f9f
> > op=mysql-horde-service_stop_0 )
> > Dec 15 20:12:55 herculespre lrmd: [8559]: info: rsc:mysql-horde-service:38:
> > stop
> >
> > 3. The MySQL service received the order and shut down properly. From
> > mysql.log:
> > 091215 20:13:14 [Note] /usr/local/etc2/mysql-horde/libexec/mysqld: Normal
> > shutdown
> > ...
> > 091215 20:13:17 [Note] /usr/local/etc2/mysql-horde/libexec/mysqld: Shutdown
> > complete
> >
> > 4. Here comes the problem: the cluster did not receive the confirmation
> > that the service was properly shut down:
> > Dec 15 20:13:17 herculespre lrmd: [8559]: WARN: mysql-horde-service:stop
> > process (PID 12270) timed out (try 1). Killing with signal SIGTERM (15).
> > Dec 15 20:13:17 herculespre lrmd: [8559]: WARN: operation stop[38] on
> > lsb::mysql-horde::mysql-horde-service for client 8562, its parameters:
> > CRM_meta_timeout=[20000] crm_feature_set=[3.0.1] : pid [12270] timed out
> > Dec 15 20:13:17 herculespre crmd: [8562]: ERROR: process_lrm_event: LRM
> > operation mysql-horde-service_stop_0 (38) Timed Out (timeout=20000ms)
> >
> > What is happening here?? As it appears in the log, the timeout is supposed to
> > be 20s (20000ms), and the service just took 3s to shut down.
> > Is it a problem with lrmd?
>
> Looks like it.

Don't think so. Here are the logs again:

Dec 15 20:12:55 herculespre lrmd: [8559]: info: rsc:mysql-horde-service:38: stop

lrmd invokes the RA to stop mysql. Whatever happened between this
time and the following entries took about 19 seconds:

20:13:14 [Note] /usr/local/etc2/mysql-horde/libexec/mysqld: Normal shutdown
20:13:17 [Note] /usr/local/etc2/mysql-horde/libexec/mysqld: Shutdown

Dec 15 20:13:17 herculespre lrmd: [8559]: WARN: mysql-horde-service:stop process (PID 12270) timed out (try 1). Killing with signal SIGTERM (15).

It could be that you were unlucky here and that the database
really took around 20 seconds to shut down, counting from the
moment the stop was requested. If so, then please increase your
timeouts. You also mentioned somewhere that 5s is set for a
monitor timeout; that's way too low for any kind of resource.
There's a chapter on applications in HA environments in a paper I
recently presented (http://tinyurl.com/yg7u4bd).

Thanks,

Dejan

> Given the time of year, it would probably be a good idea to create a
> bugzilla entry so that this doesn't get lost.
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
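
For anyone who wants to check the arithmetic behind the timing argument, here is a minimal Python sketch. The timestamps and the 20000 ms timeout are copied from the logs quoted in this thread; the year 2009 is taken from the "091215" stamp in mysql.log. The helper `ts` and the variable names are made up for illustration, and the sketch assumes that syslog and mysql.log share the same clock.

```python
from datetime import datetime

def ts(clock: str) -> datetime:
    # All quoted entries are from Dec 15; the year comes from "091215" in mysql.log.
    return datetime.strptime("2009-12-15 " + clock, "%Y-%m-%d %H:%M:%S")

stop_invoked = ts("20:12:55")     # lrmd: rsc:mysql-horde-service:38: stop
shutdown_begins = ts("20:13:14")  # mysqld: Normal shutdown
shutdown_done = ts("20:13:17")    # mysqld: Shutdown complete
lrmd_timeout = ts("20:13:17")     # lrmd: stop process (PID 12270) timed out

timeout_s = 20000 / 1000          # CRM_meta_timeout=[20000] is in milliseconds

before_shutdown = (shutdown_begins - stop_invoked).total_seconds()
mysqld_shutdown = (shutdown_done - shutdown_begins).total_seconds()
total_stop = (lrmd_timeout - stop_invoked).total_seconds()

print(f"stop invoked -> mysqld begins shutdown: {before_shutdown:.0f}s")
print(f"mysqld shutdown itself: {mysqld_shutdown:.0f}s")
print(f"stop invoked -> lrmd gives up: {total_stop:.0f}s (timeout {timeout_s:.0f}s)")
```

With one-second log resolution this gives roughly 19s before mysqld logs "Normal shutdown", 3s for the shutdown itself, and 22s from the stop request to the lrmd timeout. So the 3s figure covers only the last part of the stop operation, which is consistent with the 20s timeout having been used up.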