On 13/05/2013, at 4:14 PM, renayama19661...@ybb.ne.jp wrote:

> Hi All,
>
> We built a simple cluster in a vSphere 5.1 environment.
>
> It consists of two ESXi servers and a shared disk.
>
> The guests are placed on the shared disk.
What is on the shared disk? The whole OS or app-specific data (i.e. nothing pacemaker needs directly)?

>
> Step 1) Build the cluster. (The DC node is the active node.)
>
> ============
> Last updated: Mon May 13 14:16:09 2013
> Stack: Heartbeat
> Current DC: pgsr01 (85a81130-4fed-4932-ab4c-21ac2320186f) - partition with quorum
> Version: 1.0.13-30bb726
> 2 Nodes configured, unknown expected votes
> 2 Resources configured.
> ============
>
> Online: [ pgsr01 pgsr02 ]
>
>  Resource Group: test-group
>      Dummy1 (ocf::pacemaker:Dummy): Started pgsr01
>      Dummy2 (ocf::pacemaker:Dummy): Started pgsr01
>  Clone Set: clnPingd
>      Started: [ pgsr01 pgsr02 ]
>
> Node Attributes:
> * Node pgsr01:
>     + default_ping_set : 100
> * Node pgsr02:
>     + default_ping_set : 100
>
> Migration summary:
> * Node pgsr01:
> * Node pgsr02:
>
>
> Step 2) Run strace on the pengine process of the DC node.
>
> [root@pgsr01 ~]# ps -ef | grep heartbeat
> root 2072 1 0 13:56 ? 00:00:00 heartbeat: master control process
> root 2075 2072 0 13:56 ? 00:00:00 heartbeat: FIFO reader
> root 2076 2072 0 13:56 ? 00:00:00 heartbeat: write: bcast eth1
> root 2077 2072 0 13:56 ? 00:00:00 heartbeat: read: bcast eth1
> root 2078 2072 0 13:56 ? 00:00:00 heartbeat: write: bcast eth2
> root 2079 2072 0 13:56 ? 00:00:00 heartbeat: read: bcast eth2
> 496 2082 2072 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/ccm
> 496 2083 2072 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/cib
> root 2084 2072 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/lrmd -r
> root 2085 2072 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/stonithd
> 496 2086 2072 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/attrd
> 496 2087 2072 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/crmd
> 496 2089 2087 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/pengine
> root 2182 1 0 14:15 ? 00:00:00 /usr/lib64/heartbeat/pingd -D -p /var/run//pingd-default_ping_set -a default_ping_set -d 5s -m 100 -i 1 -h 192.168.101.254
> root 2287 1973 0 14:16 pts/0 00:00:00 grep heartbea
>
> [root@pgsr01 ~]# strace -p 2089
> Process 2089 attached - interrupt to quit
> restart_syscall(<... resuming interrupted call ...>) = 0
> times({tms_utime=5, tms_stime=6, tms_cutime=0, tms_cstime=0}) = 429527557
> recvfrom(5, 0xa93ff7, 953, 64, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=5, events=0}], 1, 0) = 0 (Timeout)
> recvfrom(5, 0xa93ff7, 953, 64, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=5, events=0}], 1, 0) = 0 (Timeout)
> (snip)
>
>
> Step 3) Disconnect the shared disk on which the active node is placed.
>
> Step 4) Cut off the pingd communication of the standby node.
>         The pingd score is reflected correctly, but pengine processing is blocked.
>
> ~ # esxcfg-vswitch -N vmnic1 -p "ap-db" vSwitch1
> ~ # esxcfg-vswitch -N vmnic2 -p "ap-db" vSwitch1
>
>
> (snip)
> brk(0xd05000) = 0xd05000
> brk(0xeed000) = 0xeed000
> brk(0xf2d000) = 0xf2d000
> fstat(6, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f86a255a000
> write(6, "BZh51AY&SY\327\373\370\203\0\t(_\200UPX\3\377\377%cT \277\377\377"..., 2243) = 2243
> brk(0xb1d000) = 0xb1d000
> fsync(6 ------------------------------> BLOCKED
> (snip)
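The "BZh..." data written to fd 6 above is bzip2 output, i.e. pengine saving one of its pe-input-*.bz2 files (normally under /var/lib/pengine on 1.0), and it is the following fsync(6) that hangs. To confirm the stall is in the storage stack rather than anything Pacemaker-specific, a bare write-and-fsync against the same filesystem should hang the same way while the datastore is disconnected. A rough probe; the default path is only an example, point it at a directory that lives on the affected datastore:

/* fsync_probe.c - minimal write-and-fsync against the suspect filesystem,
 * to check whether a plain fsync() stalls the same way pengine's does.
 * The default path is an example; pass a path on the affected datastore. */
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "/var/lib/pengine/fsync-probe.tmp";
    const char buf[] = "probe data\n";
    time_t start;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (write(fd, buf, sizeof(buf) - 1) < 0) {
        perror("write");
        return 1;
    }

    start = time(NULL);
    if (fsync(fd) < 0) {                 /* the call that hangs for pengine */
        perror("fsync");
    }
    printf("fsync returned after %ld seconds\n", (long)(time(NULL) - start));

    close(fd);
    unlink(path);
    return 0;
}

If this probe also takes minutes to return once the datastore is cut, the stall is happening in the guest kernel / VMware storage layer, and anything done on the Pacemaker side can only work around it.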
> ============
> Last updated: Mon May 13 14:19:15 2013
> Stack: Heartbeat
> Current DC: pgsr01 (85a81130-4fed-4932-ab4c-21ac2320186f) - partition with quorum
> Version: 1.0.13-30bb726
> 2 Nodes configured, unknown expected votes
> 2 Resources configured.
> ============
>
> Online: [ pgsr01 pgsr02 ]
>
>  Resource Group: test-group
>      Dummy1 (ocf::pacemaker:Dummy): Started pgsr01
>      Dummy2 (ocf::pacemaker:Dummy): Started pgsr01
>  Clone Set: clnPingd
>      Started: [ pgsr01 pgsr02 ]
>
> Node Attributes:
> * Node pgsr01:
>     + default_ping_set : 100
> * Node pgsr02:
>     + default_ping_set : 0 : Connectivity is lost
>
> Migration summary:
> * Node pgsr01:
> * Node pgsr02:
>
>
> Step 5) Reconnect the pingd communication of the standby node.
>         The pingd score is reflected correctly, but pengine processing remains blocked.
>
> ~ # esxcfg-vswitch -M vmnic1 -p "ap-db" vSwitch1
> ~ # esxcfg-vswitch -M vmnic2 -p "ap-db" vSwitch1
>
> ============
> Last updated: Mon May 13 14:19:40 2013
> Stack: Heartbeat
> Current DC: pgsr01 (85a81130-4fed-4932-ab4c-21ac2320186f) - partition with quorum
> Version: 1.0.13-30bb726
> 2 Nodes configured, unknown expected votes
> 2 Resources configured.
> ============
>
> Online: [ pgsr01 pgsr02 ]
>
>  Resource Group: test-group
>      Dummy1 (ocf::pacemaker:Dummy): Started pgsr01
>      Dummy2 (ocf::pacemaker:Dummy): Started pgsr01
>  Clone Set: clnPingd
>      Started: [ pgsr01 pgsr02 ]
>
> Node Attributes:
> * Node pgsr01:
>     + default_ping_set : 100
> * Node pgsr02:
>     + default_ping_set : 100
>
> Migration summary:
> * Node pgsr01:
> * Node pgsr02:
>
>
> --------- The pengine block state continues -----
>
> Step 6) Cut off the pingd communication of the active node.
>         The pingd score is reflected correctly, but pengine processing remains blocked.
>
> ~ # esxcfg-vswitch -N vmnic1 -p "ap-db" vSwitch1
> ~ # esxcfg-vswitch -N vmnic2 -p "ap-db" vSwitch1
>
>
> ============
> Last updated: Mon May 13 14:20:32 2013
> Stack: Heartbeat
> Current DC: pgsr01 (85a81130-4fed-4932-ab4c-21ac2320186f) - partition with quorum
> Version: 1.0.13-30bb726
> 2 Nodes configured, unknown expected votes
> 2 Resources configured.
> ============
>
> Online: [ pgsr01 pgsr02 ]
>
>  Resource Group: test-group
>      Dummy1 (ocf::pacemaker:Dummy): Started pgsr01
>      Dummy2 (ocf::pacemaker:Dummy): Started pgsr01
>  Clone Set: clnPingd
>      Started: [ pgsr01 pgsr02 ]
>
> Node Attributes:
> * Node pgsr01:
>     + default_ping_set : 0 : Connectivity is lost
> * Node pgsr02:
>     + default_ping_set : 100
>
> Migration summary:
> * Node pgsr01:
> * Node pgsr02:
>
> --------- The pengine block state continues -----
>
>
> After that, the resources do not move to the standby node, because no transition is generated while the pengine block state continues.
> In the vSphere environment the block is only released after a considerable time, and a transition is finally generated.
>  * The I/O blocking of pengine seems to occur repeatedly.
>  * Other processes may be blocked, too.
>  * It took more than one hour from the failure to the completion of failover.
>
> This problem shows that resources may fail to move after disk trouble in a vSphere environment.
>
> Because our user intends to use Pacemaker in a vSphere environment, a solution to this problem is necessary.
>
> Do you know of an example that solved a similar problem on vSphere?
>
> If there is no known solution, we think it is necessary to avoid the pengine block.
>
> For example...
>  1. crmd supervises its request to pengine with a timer...
>  2. pengine performs its writes under a timer and monitors the processing...
>  ...etc...
>
>  * This problem does not seem to occur on KVM.
>  * The difference may come from the hypervisor.
>  * In addition, the problem did not occur on a physical Linux machine.
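Regarding idea 2 above (pengine doing its writes under a timer): one way to keep the calling process responsive is to push the potentially blocking fsync() into a short-lived child and stop waiting after a deadline. The sketch below is only an illustration of that idea, not how pengine is implemented; the 30-second deadline and the "log and continue" behaviour on expiry are assumptions, and a child stuck in uninterruptible sleep will still linger, only the caller is freed.

/* timed_fsync.c - sketch of idea 2: run the potentially blocking fsync() in a
 * child process and give up waiting after a deadline, so the caller
 * (e.g. pengine) is not stuck for the whole duration of the storage outage.
 * A stuck child may linger in uninterruptible sleep; only the caller is freed. */
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Return 0 if fsync() finished within timeout_sec, -1 otherwise. */
static int fsync_with_timeout(int fd, int timeout_sec)
{
    int waited;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;                       /* fork failed */
    }
    if (pid == 0) {
        _exit(fsync(fd) == 0 ? 0 : 1);   /* child: the call that may block */
    }

    for (waited = 0; waited < timeout_sec; waited++) {
        int status;

        if (waitpid(pid, &status, WNOHANG) == pid) {
            return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
        }
        sleep(1);
    }

    kill(pid, SIGKILL);                  /* has no effect while the child is in D state */
    waitpid(pid, NULL, WNOHANG);         /* best-effort reap */
    return -1;                           /* caller decides: retry, skip, or log */
}

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "/var/lib/pengine/fsync-probe.tmp";
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (write(fd, "probe\n", 6) < 0) {
        perror("write");
    }
    if (fsync_with_timeout(fd, 30) != 0) {
        fprintf(stderr, "fsync did not complete within 30s; continuing without it\n");
    }
    close(fd);
    return 0;
}

Whether giving up on the sync is an acceptable trade-off for the PE files is of course a separate question for the developers.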
>
> Best Regards,
> Hideo Yamauchi.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org