So, the problem is solved.

The solution is interesting because it is not described anywhere.

Following the advice in the article http://www.linux-ha.org/wiki/SBD_Fencing, and because I am using multipath, I increased the timeouts when creating the partition for sbd:

sles2:~ # sbd -d /dev/mapper/SBD dump
Header version     : 2
Number of slots    : 255
Sector size        : 512
Timeout (watchdog) : 120
Timeout (allocate) : 2
Timeout (loop)     : 10
Timeout (msgwait)  : 180

Timeouts of 180 and 120 seconds suit me.

But when I created the sbd fencing primitive, I FORGOT to increase its timeout (stonith-timeout)! The primitive has to be created with it set explicitly:

crm configure primitive sbd_fense stonith:external/sbd params sbd_device="/dev/mapper/SBD" stonith-timeout="240s"

Without that setting, openais waits only about 60 seconds for stonith (the default stonith-timeout for external/sbd) and then kills it:

sles2 stonithd: [5819]: WARN: external_sbd_fense:0_1 process (PID 8688) timed out (try 1). Killing with signal SIGTERM (15).
sles2 stonithd: [8688]: info: external_run_cmd: Calling '/usr/lib64/stonith/plugins/external/sbd reset sles1' returned 15
sles2 stonithd: [8688]: CRIT: external_reset_req: 'sbd reset' for host sles1 failed with rc 15
sles2 stonithd: [5819]: debug: Child process external_sbd_fense:0_1 [8688] exited, its exit code: 5 when signo=0
...

Very fun, right?

I really hope that my experience will be useful to someone, and that the author of the article will add recommendations about timeouts when creating the sbd primitive.
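
For anyone who wants to check this before a node actually needs fencing, here is a rough sketch of the comparison. It assumes that stonith-timeout simply has to be larger than msgwait, which is what my values (240s against 180s) suggest; I have not seen that stated as an official rule. The function names msgwait_from_dump and timeout_is_sufficient are made up for this example, and the sample input is the dump from above.

```shell
#!/bin/sh
# Sketch: check that a planned stonith-timeout leaves room for sbd's msgwait.
# Assumption: stonith-timeout should be larger than msgwait, which is what
# the 240s vs. 180s values in this message suggest.

# Hypothetical helper: read `sbd ... dump` output on stdin, print msgwait (seconds).
msgwait_from_dump() {
    awk -F: '/^Timeout \(msgwait\)/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Hypothetical helper: succeed only if stonith-timeout exceeds msgwait.
# $1 = msgwait in seconds, $2 = stonith-timeout in seconds
timeout_is_sufficient() {
    [ "$2" -gt "$1" ]
}

# Sample input copied from the dump shown earlier in this message.
sample_dump='Timeout (watchdog) : 120
Timeout (allocate) : 2
Timeout (loop)     : 10
Timeout (msgwait)  : 180'

msgwait=$(printf '%s\n' "$sample_dump" | msgwait_from_dump)

# Compare the 60s default mentioned above with the 240s value I ended up using.
for candidate in 60 240; do
    if timeout_is_sufficient "$msgwait" "$candidate"; then
        echo "stonith-timeout=${candidate}s: ok (msgwait=${msgwait}s)"
    else
        echo "stonith-timeout=${candidate}s: too short (msgwait=${msgwait}s)"
    fi
done
```

On a real node the sample text could be replaced by piping the output of sbd -d /dev/mapper/SBD dump into msgwait_from_dump.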

P.S. The firewall also prevents resources from starting. Can someone explain to me how to run the resources with the firewall enabled?

--

Best regards,
Aleksey ZHOLDAK

ICQ   150074
MSN   alek...@zholdak.com
Skype aleksey.zholdak
Voice +380442388043

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
