Hi,

When I run pingd with Pacemaker 1.0.12, I sometimes see this message:

stand_alone_ping: Node XXX.XXX.XXX.XXX is unreachable (read)

XXX.XXX.XXX.XXX is either pingd's target IP (which makes sense) or, for some reason, 127.0.0.1.

I found that when some other application calls "ping" (the OS command), unrelated to Pacemaker, pingd picks up the error messages meant for that ping. These packets are not for pingd, so pingd reports "unreachable". pingd can retry with the next packet, and that works fine if there are no network problems.

I referred to the Linux "ping" command and modified pingd to ignore the above message, because it is confusing. To sum up, pingd will now "goto retry" if it gets EAGAIN or EINTR.

diff --git a/tools/pingd.c b/tools/pingd.c
index 5e64ba2..b90d26d 100644
--- a/tools/pingd.c
+++ b/tools/pingd.c
@@ -862,7 +862,10 @@ ping_read(ping_node *node, int *lenp)
     if(bytes < 0) {
         crm_perror(LOG_DEBUG, "Read failed");
-        if (saved_errno != EAGAIN && saved_errno != EINTR) {
+        if (saved_errno == EAGAIN || saved_errno == EINTR) {
+            crm_info("Retrying...");
+            goto retry;
+        } else {
             int rc = 0;
             if(node->type == AF_INET6) {
                 rc = process_icmp6_error(node, (struct sockaddr_in6*)&(node->addr));
@@ -898,6 +901,9 @@ ping_read(ping_node *node, int *lenp)
             } else if(rc > 0) {
                 crm_free(packet);
                 return TRUE;
+            } else {
+                crm_info("Retrying...");
+                goto retry;
             }
         } else {

This problem is specific to pingd, and I know Pacemaker 1.1.x recommends using the ping RA. So if there are no objections, I'll ask Mori-san to commit this to the pacemaker-1.0 repo.

Thanks,
Junko IKEDA

NTT DATA INTELLILINK CORPORATION
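
For illustration, the retry-on-EAGAIN/EINTR idea described above can be sketched in isolation roughly as follows. This is not the pingd code: read_with_retry, read_fn, flaky_state and flaky_reader are hypothetical names made up for this sketch, and the reader callback merely stands in for the receive call on the ICMP socket. Unlike the plain "goto retry" in the patch, the sketch assumes a retry limit so that the loop cannot spin forever.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical callback type standing in for the socket receive call. */
typedef ssize_t (*read_fn)(void *ctx, void *buf, size_t len);

/*
 * Hypothetical helper (not part of pingd): call `reader` again while it
 * fails with a transient error (EAGAIN or EINTR), up to `max_retries`
 * extra attempts. Any other error is returned to the caller at once,
 * with errno preserved.
 */
static ssize_t read_with_retry(read_fn reader, void *ctx, void *buf,
                               size_t len, int max_retries)
{
    int attempts = 0;

    assert(reader != NULL);

    for (;;) {
        ssize_t bytes = reader(ctx, buf, len);
        int saved_errno = errno;   /* save before any other call can change it */

        if (bytes >= 0) {
            return bytes;
        }
        if ((saved_errno == EAGAIN || saved_errno == EINTR)
            && attempts < max_retries) {
            attempts++;
            continue;              /* transient failure: try the next packet */
        }
        errno = saved_errno;       /* hard error, or retry budget used up */
        return -1;
    }
}

/* Demo stand-in for a socket: fails `failures_left` times, then succeeds. */
struct flaky_state {
    int failures_left;
    int error_code;
};

static ssize_t flaky_reader(void *ctx, void *buf, size_t len)
{
    struct flaky_state *st = ctx;

    if (st->failures_left > 0) {
        st->failures_left--;
        errno = st->error_code;
        return -1;
    }
    memset(buf, 0, len);           /* pretend a full buffer was received */
    return (ssize_t)len;
}
```

A caller would pass its own receive wrapper in place of flaky_reader. The retry limit is a choice made for this sketch only; the patch above retries without a limit.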
Attachment: pingd.patch
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org