I resolved the problem. It turned out to be a bug in the ethmonitor agent. In ethmonitor:

    # get the link status on $NIC
    # asks ip about running (up) interfaces, returns the number of matching interface names that are up
    get_link_status () {
        $IP2UTIL -o link show up dev "$NIC" | grep -c "$NIC"
    }

The command "ip -o link show up dev eth0" only detects that the interface is down. It cannot detect that the link is down. So I guess the developer only tested with "ifdown eth0/bond0" and did not consider the case where the cable is unplugged.

In the end I decided to add the check to IPaddr2 and no longer use the ethmonitor agent. I changed the monitor function of the ocf:heartbeat:IPaddr2 agent:

    ip_monitor() {
        # TODO: Implement more elaborate monitoring like checking for
        # interface health maybe via a daemon like FailSafe etc...

        # added: fail the monitor unless the link reports "state UP"
        t=$(ip link show "$NIC" | grep -c "state UP")
        test "$t" -ne 1 && return $OCF_ERR_PERM
        ...

So if the NIC link goes down or the interface goes down, the resource is switched to the other node. But you need to add the failure-timeout meta attribute to the ocf:heartbeat:IPaddr2 resource. Something like this:

    node sles11264-node1
    node sles11264-node2
    primitive p_apache lsb:apache2 \
        op monitor interval="15" timeout="30"
    primitive p_vip ocf:heartbeat:IPaddr2 \
        params ip="192.168.203.250" nic="eth0" iflabel="0" \
        op monitor interval="10" timeout="20" \
        meta failure-timeout="5"
    group g_apache p_vip p_apache \
        meta target-role="Started"
    property $id="cib-bootstrap-options" \
        dc-version="1.1.6-b988976485d15cb702c9307df55512d323831a5e" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="no" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1340872994"

About meta failure-timeout="5": be careful when you set this value. If it is too small, the other node will not have enough time to take over. So calculate it for your setup, and set it larger rather than smaller.

My English is not good, so I hope you can understand. If you read Chinese, you can see my blog:
http://linux.52zhe.info/read.php/275.htm

On Fri, Jun 29, 2012 at 1:01 PM, kook <kook...@gmail.com> wrote:
> For test. I don't know how to reply this subject.
>
> On Mon, Jun 25, 2012 at 4:00 PM, kook <kook...@gmail.com> wrote:
>
>> Dear Fiorenza:
>>
>> I have the same problem with you. I checked the newest ethmonitor ra
>> (ClusterLabs-resource-agents-v3.9.2-0-ge261943.tar). It's same with my sles
>> 11 sp2.
>>
>> Failed actions:
>>
>> p_ethmonitor:1_monitor_15000 (node=sles11264-node1, call=1591, rc=-2,
>> status=Timed Out): unknown exec error
>>
>> so, can you tell me. how did you solved this problem. Thanks.
>>
>> liujia
>>
>> On 21/03/2012 09:06, Florian Haas wrote:
>> > On Tue, Mar 20, 2012 at 4:18 PM, Fiorenza Meini <fmeini at esseweb.eu> wrote:
>> >> Hi there,
>> >> has anybody configured successfully the RA specified in the object of the
>> >> message?
>> >>
>> >> I got this error: if_eth0_monitor_0 (node=fw1, call=2297, rc=-2,
>> >> status=Timed Out): unknown exec error
>> >
>> > Your ethmonitor RA missed its 50-second timeout on the probe (that is,
>> > the initial monitor operation). You should be seeing "Monitoring of
>> > if_eth0 failed, X retries left" warnings in your logs. Grepping your
>> > syslog for "ethmonitor" will probably turn up some useful results.
>> >
>> > Cheers,
>> > Florian
>>
>> Thank you, I solved the problem.
>>
>> Regards
>>
>> --
>>
>> Fiorenza Meini
>> Spazio Web S.r.l.
>>
>> V. Dante Alighieri, 10 - 13900 Biella
>> Tel.: 015.2431982 - 015.9526066
>> Fax: 015.2522600
>> Reg. Imprese, CF e P.I.: 02414430021
>> Iscr. REA: BI - 188936
>> Iscr. CCIAA: Biella - 188936
>> Cap. Soc.: 30.000,00 Euro i.v.
>>
>> ----------------------------
>> Side A or B
>
> --
> ----------------------------
> I have a dream. Hehe....

--
----------------------------
I have a dream. Hehe....
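
P.S. To make the difference between "interface down" and "link down" easier to see, here is a minimal sketch that reads the kernel's carrier flag from sysfs instead of parsing the output of ip. It is only an illustration, not code from ethmonitor or IPaddr2. The function name link_state and the SYSFS_NET variable are hypothetical names made up for this example. It assumes a Linux system where /sys/class/net/<nic>/carrier contains 1 when a link is detected and 0 when it is not, and where reading that file fails while the interface is administratively down.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of any resource agent.
# SYSFS_NET can be overridden to try the function against a fake directory tree.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print one of: up, no-carrier, admin-down, missing.
# Returns 0 only when the link has carrier.
link_state() {
    nic="$1"
    dir="$SYSFS_NET/$nic"

    # No such interface at all
    if [ ! -d "$dir" ]; then
        echo "missing"
        return 2
    fi

    # Assumption: reading carrier fails when the interface is
    # administratively down (for example after ifdown)
    if ! carrier=$(cat "$dir/carrier" 2>/dev/null); then
        echo "admin-down"
        return 1
    fi

    # carrier is 1 when the cable is plugged in and a link is detected
    if [ "$carrier" = "1" ]; then
        echo "up"
        return 0
    fi

    # Interface is up, but no link (for example cable unplugged)
    echo "no-carrier"
    return 1
}
```

For example, "link_state eth0" could be called from a monitor function, treating any non-zero return as a failure. The "no-carrier" case is the one that "ip -o link show up" does not catch.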
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org