Re: [Pacemaker] resources does not start on survied node after reboot

Lars Marowsky-Bree Thu, 31 Oct 2013 10:28:21 -0700

On 2013-10-29T18:12:51, Саша Александров <[email protected]> wrote:


> Oct 29 13:04:21 wcs2 pengine[2362]:  warning: stage6: Scheduling Node wcs1
> for STONITH
> Oct 29 13:04:21 wcs2 crmd[2363]:   notice: te_fence_node: Executing reboot
> fencing operation (53) on wcs1 (timeout=60000)
> Oct 29 13:05:33 wcs2 stonith-ng[2359]:    error: remote_op_done: Operation
> reboot of wcs1 by wcs2 for [email protected]: Timer expired
> Oct 29 13:05:33 wcs2 crmd[2363]:   notice: tengine_stonith_callback:
> Stonith operation 2/53:0:0:f56c4538-1ad8-4871-825e-167eb9304677: Timer

> The node wcs1 is off, should not SBD determine that, and should not the
> cluster start the resources?

The operation times out after about 72s here (13:04:21 to 13:05:33), and
there is nothing in the logs showing sbd actually being called.

The most common cause is stonith-timeout in the CIB being set too short
for the "msgwait" timeout configured in sbd.
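
As a rough illustration of that relationship, here is a minimal Python
sketch. The helper name stonith_timeout_ok and the 1.2 safety margin are
assumptions for illustration (a commonly quoted rule of thumb is
stonith-timeout >= msgwait plus about 20%); the values are hypothetical
and not taken from the poster's cluster.

```python
def stonith_timeout_ok(stonith_timeout_s: float,
                       msgwait_s: float,
                       margin: float = 1.2) -> bool:
    """Return True if stonith-timeout leaves headroom over sbd's msgwait.

    margin=1.2 is an assumed rule of thumb, not a value from this thread.
    """
    return stonith_timeout_s >= msgwait_s * margin


if __name__ == "__main__":
    # Hypothetical values, in seconds.
    print(stonith_timeout_ok(60, 10))   # msgwait well below stonith-timeout
    print(stonith_timeout_ok(60, 90))   # msgwait longer than stonith-timeout
```

If the second case matches your setup, the fencing request can expire
before sbd has had time to confirm delivery of the message.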

It may be easier to help you if you share your configuration, versions,
and "sbd dump" for the devices configured.
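
If you want to pull the msgwait value out of the dump yourself, a small
Python sketch along these lines may help. It assumes the dump prints a
line of the form "Timeout (msgwait)  : <seconds>"; the sample text below
is illustrative only, and the function name msgwait_from_dump is
hypothetical.

```python
import re
from typing import Optional

# Assumed line format: "Timeout (msgwait)  : 10"
MSGWAIT_RE = re.compile(r"^Timeout \(msgwait\)\s*:\s*(\d+)\s*$", re.MULTILINE)


def msgwait_from_dump(dump_text: str) -> Optional[int]:
    """Extract the msgwait timeout (seconds) from dump text, or None."""
    match = MSGWAIT_RE.search(dump_text)
    return int(match.group(1)) if match else None


# Illustrative sample only, not output from the poster's devices.
SAMPLE = """\
Timeout (watchdog) : 5
Timeout (msgwait)  : 10
"""

if __name__ == "__main__":
    print(msgwait_from_dump(SAMPLE))
```

The extracted number can then be compared against the stonith-timeout
property in the CIB.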


Regards,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde


_______________________________________________
Pacemaker mailing list: [email protected]
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
