Hi Greg,

a common and often overlooked cause is a stop action timeout
that is set too low. If this value is too small,
the stop action times out, which leads to the node
being fenced (STONITH).

Look at resources that might take a long time to stop
properly (also when the node is under load). Just one example: unmounting a
filesystem with many dirty buffers.
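
As a rough illustration of the dirty-buffer case, here is a minimal Python
sketch that estimates how long a flush might take and derives a stop timeout
from it. It relies only on the "Dirty" and "Writeback" lines of /proc/meminfo
(reported in kB). The function names, the sustained write rate, the safety
factor and the 60 second floor are all assumptions chosen for illustration,
not values recommended by Pacemaker.

```python
import math


def dirty_kib(meminfo_text):
    """Sum the Dirty and Writeback values (in KiB) found in /proc/meminfo text."""
    total = 0
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("Dirty", "Writeback"):
            total += int(rest.split()[0])
    return total


def suggested_stop_timeout(pending_kib, write_mib_per_s,
                           safety_factor=3.0, floor_s=60):
    """Return a stop timeout in seconds (hypothetical rule of thumb).

    pending_kib:     data still waiting to be written, in KiB
    write_mib_per_s: assumed sustained write rate of the storage, in MiB/s
    safety_factor:   assumed headroom for load and other stop work
    floor_s:         assumed minimum timeout, regardless of pending data
    """
    flush_seconds = (pending_kib / 1024) / write_mib_per_s
    return max(floor_s, math.ceil(flush_seconds * safety_factor))


# Sample input in /proc/meminfo format (values made up for the example).
SAMPLE_MEMINFO = """MemTotal:       16384000 kB
Dirty:           2048000 kB
Writeback:             0 kB
"""

pending = dirty_kib(SAMPLE_MEMINFO)
timeout = suggested_stop_timeout(pending, write_mib_per_s=50.0)
print(f"pending={pending} KiB, suggested stop timeout={timeout}s")
```

On a live node you would pass the contents of /proc/meminfo, sampled under
realistic load, instead of the sample string, and then compare the result
with the stop timeout configured for the filesystem resource.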

Best regards
Andreas Mock



-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Greg Woods
Sent: Monday, 22 April 2013 17:51
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] clean shutdown procedure?

On Mon, 2013-04-22 at 10:12 +1000, Andrew Beekhof wrote:
> On Saturday, April 20, 2013, Greg Woods wrote:
> > Often one of the
> > nodes gets stuck at "Stopping HA Services"
> 
> 
> That means pacemaker is waiting for one of your resources to stop.
> Do you have anything that would take a long time (or fail to stop)?

Not that I am aware of. But some things that came up during this
weekend's powerdown make me think that some of the stop actions are
failing, because setting the stop-all-resources=true property sometimes
caused nodes to be fenced. 

I always dread having to try and find useful information in the
voluminous Pacemaker/Heartbeat logs, but I'll have to try. Of course,
this doesn't happen on the test clusters, and it is hard to debug it
when reproducing it requires creating a service outage on a production
cluster.

--Greg



_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
