Does anybody have a clue what is going on? Is this a bug in pingd, or a real
connectivity problem that the system ping does not detect?

Is there any way to replace pingd with, e.g., a bash script that checks
connectivity and reports the connection status to the heartbeat system (e.g.
resource is stopped, or resource has failed), so that the score for a given
resource group is recalculated and, in case of connectivity problems, the
resource group is relocated to another machine? Is this practical with a
crm-style configuration?

For example, a bash subroutine:

check_connection () {
    node=$1
    [ -z "$node" ] && return 1
    NPACKETS=3
    stat=0
    ping -n -q -c $NPACKETS "$node" >/dev/null 2>&1
    if [ "$?" -ne 0 ]; then
        echo "ERROR: Ping node $node does not answer to ICMP pings"
        stat=1
    else
        echo "INFO: Ping node $node answers to ICMP pings"
    fi
    return $stat
}
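
To make the idea more concrete, here is a rough sketch of how such a check might be turned into a pingd-like score (number of reachable ping nodes times a multiplier, as in "pingd=1000 (1 active ping nodes)"). All names here (MULTIPLIER, ATTR_NAME, count_reachable, compute_score, report_score) are hypothetical. The attrd_updater call with -n/-v is an assumption about the installed Pacemaker version and should be checked against "attrd_updater --help" before use.

```shell
#!/bin/bash
# Hypothetical settings; none of these names come from pingd itself.
MULTIPLIER=${MULTIPLIER:-1000}   # mirrors "-m 1000" on the pingd command line
ATTR_NAME=${ATTR_NAME:-pingd}    # mirrors "-a pingd"

# Count how many of the given nodes answer ICMP pings.
# A node counts as reachable when ping exits with status 0.
count_reachable () {
    local node active=0
    for node in "$@"; do
        if ping -n -q -c 3 "$node" >/dev/null 2>&1; then
            active=$((active + 1))
        fi
    done
    echo "$active"
}

# Score = number of reachable nodes times the multiplier.
compute_score () {
    echo $(( $1 * MULTIPLIER ))
}

# ASSUMPTION: attrd_updater accepts -n (attribute name) and -v (value)
# on this Pacemaker version; verify before relying on it.
report_score () {
    attrd_updater -n "$ATTR_NAME" -v "$1"
}
```

A cron job or a simple loop could then call something like report_score "$(compute_score "$(count_reachable 3.27.60.1)")". The resource group would presumably still need a location rule on that attribute for the score to have any effect.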

 
I would be grateful for help,

Jarek


"General Linux-HA mailing list" <[email protected]> wrote:
 > 
 > I additionally found the error message shown below. Please advise.
 > 
 > Thanks
 > Jarek
 > 
 > pingd[6890]: 2009/07/24_14:47:15 debug: stand_alone_ping: Node 3.27.60.1 is 
 > alive
 > pingd[6890]: 2009/07/24_14:47:15 debug: debug2: ping_close: Closed 
 > connection to 3.27.60.1
 > pingd[6890]: 2009/07/24_14:47:15 debug: send_update: Sent update: pingd=1000 
 > (1 active ping nodes)
 > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: stand_alone_ping: Checking 
 > connectivity
 > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_open: Got address 
 > 3.27.60.1 for 3.27.60.1
 > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_open: Opened connection 
 > to 3.27.60.1
 > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_write: Sent 39 bytes to 
 > 3.27.60.1
 > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_read: Got 59 bytes
 > No error message: -1: Resource temporarily unavailable (11)
 > pingd[6890]: 2009/07/24_14:47:16 debug: process_icmp_error: No error 
 > message: -1: Resource temporarily unavailable (11)
 > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: dump_v4_echo: Echo from 
 > 3.27.60.1 (exp=1238, seq=18367, id=11669, dest=3.
 > 27.60.1, data=pingd-v4): Echo Reply
 > pingd[6890]: 2009/07/24_14:47:16 info: stand_alone_ping: Node 3.27.60.1 is 
 > unreachable (read)
 > pingd[6890]: 2009/07/24_14:47:17 debug: debug2: ping_write: Sent 39 bytes to 
 > 3.27.60.1
 > pingd[6890]: 2009/07/24_14:47:17 debug: debug2: ping_read: Got 59 bytes
 > No error message: -1: Resource temporarily unavailable (11)
 > pingd[6890]: 2009/07/24_14:47:17 debug: process_icmp_error: No error 
 > message: -1: Resource temporarily unavailable (11)
 > pingd[6890]: 2009/07/24_14:47:17 debug: debug2: dump_v4_echo: Echo from 
 > 3.27.60.1 (exp=1239, seq=1238, id=6890, dest=3.27
 > .60.1, data=pingd-v4): Echo Reply
 > pingd[6890]: 2009/07/24_14:47:17 info: stand_alone_ping: Node 3.27.60.1 is 
 > unreachable (read)
 > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_close: Closed 
 > connection to 3.27.60.1
 > pingd[6890]: 2009/07/24_14:47:18 debug: send_update: Sent update: pingd=0 (0 
 > active ping nodes)
 > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: stand_alone_ping: Checking 
 > connectivity
 > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_open: Got address 
 > 3.27.60.1 for 3.27.60.1
 > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_open: Opened connection 
 > to 3.27.60.1
 > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_write: Sent 39 bytes to 
 > 3.27.60.1
 > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_read: Got 59 bytes
 > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: dump_v4_echo: Echo from 
 > 3.27.60.1 (exp=1240, seq=1240, id=6890, dest=3.27
 > .60.1, data=pingd-v4): Echo Reply
 > pingd[6890]: 2009/07/24_14:47:18 debug: stand_alone_ping: Node 3.27.60.1 is 
 > alive
 > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_close: Closed 
 > connection to 3.27.60.1
 > p
 >  
 > "General Linux-HA mailing list" <[email protected]> wrote:
 >  > Below is part of the output, including the error message, produced by the command:
 >  > /usr/lib64/heartbeat/pingd -VVV -a pingd -d 10 -m 1000 -h 3.27.60.1
 >  > 
 >  > The machine has three network interfaces and is connected to three 
 > different subnets (3.27.x.x, 192.168.x.x - cluster subnet, 172.22.x.x - 
 > dedicated for heartbeat).
 >  > 
 >  > pingd[6890]: 2009/07/24_14:44:36 debug: debug2: ping_close: Closed 
 > connection to 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:36 debug: send_update: Sent update: 
 > pingd=1000 (1 active ping nodes)
 >  > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: stand_alone_ping: 
 > Checking connectivity
 >  > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_open: Got address 
 > 3.27.60.1 for 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_open: Opened 
 > connection to 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_write: Sent 39 bytes 
 > to 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_read: Got 59 bytes
 >  > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: dump_v4_echo: Echo from 
 > 3.27.60.1 (exp=1080, seq=1080, id=6890, dest=3.27.60.1, data=pingd-v4): Echo 
 > Reply
 >  > pingd[6890]: 2009/07/24_14:44:38 debug: stand_alone_ping: Node 3.27.60.1 
 > is alive
 >  > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_close: Closed 
 > connection to 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:38 debug: send_update: Sent update: 
 > pingd=1000 (1 active ping nodes)
 >  > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: stand_alone_ping: 
 > Checking connectivity
 >  > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_open: Got address 
 > 3.27.60.1 for 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_open: Opened 
 > connection to 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_write: Sent 39 bytes 
 > to 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:39 debug: debug2: ping_read: Got 262 bytes
 >  > No error message: -1: Resource temporarily unavailable (11)
 >  > pingd[6890]: 2009/07/24_14:44:39 debug: process_icmp_error: No error 
 > message: -1: Resource temporarily unavailable (11)
 >  > pingd[6890]: 2009/07/24_14:44:39 debug: debug2: dump_v4_echo: Echo from 
 > 172.22.10.2 (exp=1081, seq=0, id=0, dest=3.27.60.1, data=E?): Unreachable 
 > Port
 >  > pingd[6890]: 2009/07/24_14:44:39 info: stand_alone_ping: Node 3.27.60.1 
 > is unreachable (read)
 >  > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: ping_write: Sent 39 bytes 
 > to 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: ping_read: Got 262 bytes
 >  > No error message: -1: Resource temporarily unavailable (11)
 >  > pingd[6890]: 2009/07/24_14:44:40 debug: process_icmp_error: No error 
 > message: -1: Resource temporarily unavailable (11)
 >  > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: dump_v4_echo: Echo from 
 > 192.168.0.5 (exp=1082, seq=0, id=0, dest=3.27.60.1, data=E?): Unreachable 
 > Port
 >  > pingd[6890]: 2009/07/24_14:44:40 info: stand_alone_ping: Node 3.27.60.1 
 > is unreachable (read)
 >  > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_close: Closed 
 > connection to 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:41 debug: send_update: Sent update: pingd=0 
 > (0 active ping nodes)
 >  > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: stand_alone_ping: 
 > Checking connectivity
 >  > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_open: Got address 
 > 3.27.60.1 for 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_open: Opened 
 > connection to 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_write: Sent 39 bytes 
 > to 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_read: Got 59 bytes
 >  > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: dump_v4_echo: Echo from 
 > 3.27.60.1 (exp=1083, seq=1083, id=6890, dest=3.27.60.1, data=pingd-v4): Echo 
 > Reply
 >  > pingd[6890]: 2009/07/24_14:44:41 debug: stand_alone_ping: Node 3.27.60.1 
 > is alive
 >  > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_close: Closed 
 > connection to 3.27.60.1
 >  > pingd[6890]: 2009/07/24_14:44:41 debug: send_update: Sent update: 
 > pingd=1000 (1 active ping nodes)
 >  > 
 >  > Thanks
 >  > Jarek
 >  >  
 >  > "General Linux-HA mailing list" <[email protected]> wrote:
 >  >  > 2009/7/24  <[email protected]>:
 >  >  > >
 >  >  > > Rpm built for RHEL5:
 >  >  > > heartbeat-common-2.99.2-8.1
 >  >  > > libheartbeat2-2.99.2-8.1
 >  >  > > heartbeat-2.99.2-8.1
 >  >  > > heartbeat-resources-2.99.2-8.1
 >  >  > > pacemaker-1.0.3-2.2
 >  >  > > pacemaker-mgmt-client-1.99.1-2.1
 >  >  > > libpacemaker3-1.0.3-2.2
 >  >  > > pacemaker-mgmt-1.99.1-2.1
 >  >  > >
 >  >  > > If I start pingd manually (alongside the running heartbeat+pacemaker), it
 >  >  > > prints the following whenever "stand_alone_ping: Node xx.yy.zz.ww is
 >  >  > > unreachable (read)" appears in /var/log/ha-debug:
 >  >  > >
 >  >  > > [r...@gate2]# date ;/usr/lib64/heartbeat/pingd -a pingd -d 10 -m 
 > 1000 -h xx.yy.zz.ww; date
 >  >  > > Thu Jul 23 19:25:24 CEST 2009
 >  >  > > No error message: -1: Resource temporarily unavailable (11)
 >  >  > > No error message: -1: Resource temporarily unavailable (11)
 >  >  > > No error message: -1: Resource temporarily unavailable (11)
 >  >  > > No error message: -1: Resource temporarily unavailable (11)
 >  >  > > No error message: -1: Resource temporarily unavailable (11)
 >  >  > > No error message: -1: Resource temporarily unavailable (11)
 >  >  > > ...
 >  >  > >
 >  >  > > System ping reports no errors.
 >  >  > >
 >  >  > 
 >  >  > If you repeat that test with some extra -V arguments, you should see
 >  >  > more information (which would be helpful).
 >  >  > But it's pretty clear there must be a bug, so it's probably worth
 >  >  > creating an entry in bugzilla.
 >  >  > _______________________________________________
 >  >  > Linux-HA mailing list
 >  >  > [email protected]
 >  >  > http://lists.linux-ha.org/mailman/listinfo/linux-ha
 >  >  > See also: http://linux-ha.org/ReportingProblems
 >  > 
