Hi all,

We're running a two-node cluster with a bunch of OpenVZ containers as resources and use SBD as the fencing method. We're still in testing mode and performed some IO benchmarks on NFS with tiobench. While we were running those tests, the node fenced itself as soon as tiobench finished. We looked for the reason and found the following line in the syslog:
---
Apr 2 16:44:07 hostname sbd: [2790]: WARN: Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0)
---

I assume this could be prevented by setting sbd's -5 flag to a value higher than the default of 3 s. But what is a good value?

However, the NFS share we performed the tests on is provided by a different host than the one that provides the iSCSI device used by SBD. How come those two interfere?

The next time, we monitored the sbd access time during the benchmark with this command:

$ while true; do (time sbd -d /dev/sdd list) 2>&1 | grep real; sleep 1; done

During the test it was usually ~0.030 s. However, just when the test finished, it was much higher, around 2-4 s.

Actually, we are not so much concerned about this right now, but we would like to make sure that it is not possible for a container doing extensive IO to fence the whole node. How can this be safely prevented?

Roman
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
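For reference, the warning quoted above corresponds to the latency-warn threshold the post refers to as sbd's -5 flag (default 3 s). A minimal sketch of how that flag could be raised via the daemon's startup options; the sysconfig path and the 10 s value are assumptions for illustration, not validated settings:

```shell
# /etc/sysconfig/sbd -- path and variable names as used on SUSE-style
# setups; adjust for your distribution (an assumption here)

# The shared iSCSI device used by SBD, as in the post above
SBD_DEVICE="/dev/sdd"

# -5 <seconds>: latency-warn threshold, per the post's description of the
# flag (default 3 s). Raising it to 10 s is an illustrative guess only;
# it must stay well below the watchdog/msgwait timeouts to be safe.
SBD_OPTS="-5 10"
```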
