Hi Andrew,

Thank you for your comment.

> So if I can summarize, you're saying that clnUMdummy02 should not be
> allowed to run on srv01 because the combined number of failures is 6
> (and clnUMdummy02 is a non-unique clone).
>
> And that the current behavior is that clnUMdummy02 continues to run.
>
> Is that an accurate summary?

Yes.

> If so, then I agree its a bug. Could you create a bugzilla entry for it
> please?

Understood. I will register this problem in Bugzilla.

What about my other question?
Is the difference in clone replacement behavior between N1 and N4 the intended behavior (by specification)?

> Of a clone rising in a N1(srv01) node at the time of "globally-unique=false"
> is replacing it right?
> In addition, is it right movement that replacement does not happen even if a
> clone breaks down in a N4(srv04) node?

Best Regards,
Hideo Yamauchi.

--- Andrew Beekhof <and...@beekhof.net> wrote:

> 2010/3/12 <renayama19661...@ybb.ne.jp>:
> > Hi,
> >
> > We tested the trouble of the clone.
> >
> > I confirmed it in the next procedure.
> >
> > Step1)I start all nodes and update cib.xml.
> >
> > ============
> > Last updated: Fri Mar 12 14:53:38 2010
> > Stack: openais
> > Current DC: srv01 - partition with quorum
> > Version: 1.0.7-049006f172774f407e165ec82f7ee09cb73fd0e7
> > 4 Nodes configured, 2 expected votes
> > 13 Resources configured.
> > ============
> >
> > Online: [ srv01 srv02 srv03 srv04 ]
> >
> >  Resource Group: UMgroup01
> >     UmVIPcheck (ocf::heartbeat:Dummy): Started srv01
> >     UmIPaddr   (ocf::heartbeat:Dummy): Started srv01
> >     UmDummy01  (ocf::heartbeat:Dummy): Started srv01
> >     UmDummy02  (ocf::heartbeat:Dummy): Started srv01
> >  Resource Group: OVDBgroup02-1
> >     prmExPostgreSQLDB1 (ocf::heartbeat:Dummy): Started srv01
> >     prmFsPostgreSQLDB1-1       (ocf::heartbeat:Dummy): Started srv01
> >     prmFsPostgreSQLDB1-2       (ocf::heartbeat:Dummy): Started srv01
> >     prmFsPostgreSQLDB1-3       (ocf::heartbeat:Dummy): Started srv01
> >     prmIpPostgreSQLDB1 (ocf::heartbeat:Dummy): Started srv01
> >     prmApPostgreSQLDB1 (ocf::heartbeat:Dummy): Started srv01
> >  Resource Group: OVDBgroup02-2
> >     prmExPostgreSQLDB2 (ocf::heartbeat:Dummy): Started srv02
> >     prmFsPostgreSQLDB2-1       (ocf::heartbeat:Dummy): Started srv02
> >     prmFsPostgreSQLDB2-2       (ocf::heartbeat:Dummy): Started srv02
> >     prmFsPostgreSQLDB2-3       (ocf::heartbeat:Dummy): Started srv02
> >     prmIpPostgreSQLDB2 (ocf::heartbeat:Dummy): Started srv02
> >     prmApPostgreSQLDB2 (ocf::heartbeat:Dummy): Started srv02
> >  Resource Group: OVDBgroup02-3
> >     prmExPostgreSQLDB3 (ocf::heartbeat:Dummy): Started srv03
> >     prmFsPostgreSQLDB3-1       (ocf::heartbeat:Dummy): Started srv03
> >     prmFsPostgreSQLDB3-2       (ocf::heartbeat:Dummy): Started srv03
> >     prmFsPostgreSQLDB3-3       (ocf::heartbeat:Dummy): Started srv03
> >     prmIpPostgreSQLDB3 (ocf::heartbeat:Dummy): Started srv03
> >     prmApPostgreSQLDB3 (ocf::heartbeat:Dummy): Started srv03
> >  Resource Group: grpStonith1
> >     prmStonithN1       (stonith:external/ssh): Started srv04
> >  Resource Group: grpStonith2
> >     prmStonithN2       (stonith:external/ssh): Started srv01
> >  Resource Group: grpStonith3
> >     prmStonithN3       (stonith:external/ssh): Started srv02
> >  Resource Group: grpStonith4
> >     prmStonithN4       (stonith:external/ssh): Started srv03
> >  Clone Set: clnUMgroup01
> >     Started: [ srv01 srv04 ]
> >  Clone Set: clnPingd
> >     Started: [ srv01 srv02 srv03 srv04 ]
> >  Clone Set: clnDiskd1
> >     Started: [ srv01 srv02 srv03 srv04 ]
> >  Clone Set: clnG3dummy1
> >     Started: [ srv01 srv02 srv03 srv04 ]
> >  Clone Set: clnG3dummy2
> >     Started: [ srv01 srv02 srv03 srv04 ]
> >
> > Step2)I generate the trouble of the clnUMgroup01 clone in a N1(srv01) node.
> >
> >  [r...@srv01 ~]# rm -rf /var/run/heartbeat/rsctmp/Dummy-clnUMdummy02\:0.state
> >
> >  * The clone resources are replaced.
> >
> >  [r...@srv01 ~]# ls /var/run/heartbeat/rsctmp/Dummy-clnUMdummy0*
> >  /var/run/heartbeat/rsctmp/Dummy-clnUMdummy01:1.state
> >  /var/run/heartbeat/rsctmp/Dummy-clnUMdummy02:1.state
> >
> > Step3)Again...I generate the trouble of the clnUMgroup01 clone in a
> > N1(srv01) node.
> >
> >  [r...@srv01 ~]# rm -rf /var/run/heartbeat/rsctmp/Dummy-clnUMdummy02\:1.state
> >
> >  [r...@srv01 ~]# ls /var/run/heartbeat/rsctmp/Dummy-clnUMdummy0*
> >  /var/run/heartbeat/rsctmp/Dummy-clnUMdummy01:0.state
> >  /var/run/heartbeat/rsctmp/Dummy-clnUMdummy02:0.state
> >
> >  * The clone resources are replaced.
> >
> > ============
> > Last updated: Fri Mar 12 14:56:19 2010
> > Stack: openais
> > Current DC: srv01 - partition with quorum
> > Version: 1.0.7-049006f172774f407e165ec82f7ee09cb73fd0e7
> > 4 Nodes configured, 2 expected votes
> > 13 Resources configured.
> > ============
> > Online: [ srv01 srv02 srv03 srv04 ]
> > (snip)
> > Migration summary:
> > * Node srv02:
> > * Node srv03:
> > * Node srv04:
> > * Node srv01:
> >   clnUMdummy02:0: migration-threshold=5 fail-count=1
> >   clnUMdummy02:1: migration-threshold=5 fail-count=1
> >
> > Step4)I generate the trouble of the clnUMgroup01 clone in a N4(srv04) node.
> >
> > [r...@srv04 ~]# rm -rf /var/run/heartbeat/rsctmp/Dummy-clnUMdummy02\:1.state
> > [r...@srv04 ~]# ls /var/run/heartbeat/rsctmp/Dummy-clnUMdummy02*
> > /var/run/heartbeat/rsctmp/Dummy-clnUMdummy02:1.state
> >
> >  * The clone resources are not replaced.
> >
> > Step5)Again...I generate the trouble of the clnUMgroup01 clone in a
> > N4(srv04) node.
> >
> >  * The clone resources are not replaced.
> >
> > (snip)
> > Migration summary:
> > * Node srv02:
> > * Node srv03:
> > * Node srv04:
> >   clnUMdummy02:1: migration-threshold=5 fail-count=2
> > * Node srv01:
> >   clnUMdummy02:0: migration-threshold=5 fail-count=1
> >   clnUMdummy02:1: migration-threshold=5 fail-count=1
> >
> >
> > Step6)Again...I generate the trouble of the clnUMgroup01 clone in a
> > N4(srv04) node and N1(srv01) node.
> >
> >  * In the N4 node, trouble of clnUMdummy02 is handled at five times,
> > but, in the N1 node, it is processed at much number of times for
> > replacement.
> >
> > (snip)
> >  Clone Set: clnUMgroup01
> >     Started: [ srv01 ]
> >     Stopped: [ clnUmResource:1 ]
> >  Clone Set: clnPingd
> >     Started: [ srv01 srv02 srv03 srv04 ]
> >  Clone Set: clnDiskd1
> >     Started: [ srv01 srv02 srv03 srv04 ]
> >  Clone Set: clnG3dummy1
> >     Started: [ srv01 srv02 srv03 srv04 ]
> >  Clone Set: clnG3dummy2
> >     Started: [ srv01 srv02 srv03 srv04 ]
> >
> > Migration summary:
> > * Node srv02:
> > * Node srv03:
> > * Node srv04:
> >   clnUMdummy02:1: migration-threshold=5 fail-count=5
> > * Node srv01:
> >   clnUMdummy02:0: migration-threshold=5 fail-count=3
> >   clnUMdummy02:1: migration-threshold=5 fail-count=3
> >
> > Of a clone rising in a N1(srv01) node at the time of
> > "globally-unique=false" is replacing it right?
> > In addition, is it right movement that replacement does not happen even if
> > a clone breaks down in a N4(srv04) node?
> >
> > We think that, furthermore, there is a problem because the replacement is
> > different.
> >
> > When it was assumed that the replacement of this clone is right, arrival to
> > the trouble number of times is different from a N4(srv04) node in a
> > N1(srv01) node.
> >
> > By this movement, we cannot set the limit of the trouble number of times of
> > the clone well.
>
> So if I can summarize, you're saying that clnUMdummy02 should not be
> allowed to run on srv01 because the combined number of failures is 6
> (and clnUMdummy02 is a non-unique clone).
>
> And that the current behavior is that clnUMdummy02 continues to run.
>
> Is that an accurate summary?
> If so, then I agree its a bug. Could you create a bugzilla entry for it
> please?
>
> >
> > This is specifications or bug? (Or is it already solved in the development
> > version?)
> > Is setting to operate definitely necessary for cib.xml?
> >
> > If there is a setting method of right cib.xml, please teach it.
> >
> > Because the size of the collection of hb_report result is big, I do not
> > attach it.
> > If there is information of hb_report which is necessary for the solution of
> > the problem, give me comments.
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> >
> > _______________________________________________
> > Pacemaker mailing list
> > Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> >
> > _______________________________________________
> === The rest of the message has been omitted ===
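
For reference, the arithmetic behind the "combined number of failures is 6" summary can be sketched as follows. This is only an illustration of summing the per-instance fail-counts of a non-unique clone per node, using the Step 6 migration summary quoted above as input. It is not Pacemaker's actual implementation, and the function name combined_failures and the text-parsing approach are assumptions made for this sketch.

```python
import re
from collections import defaultdict

# Step 6 migration summary, copied from the quoted output in this thread.
SUMMARY = """\
* Node srv02:
* Node srv03:
* Node srv04:
  clnUMdummy02:1: migration-threshold=5 fail-count=5
* Node srv01:
  clnUMdummy02:0: migration-threshold=5 fail-count=3
  clnUMdummy02:1: migration-threshold=5 fail-count=3
"""

NODE_RE = re.compile(r"\* Node (\S+):")
FAIL_RE = re.compile(r"(\S+?):\d+: migration-threshold=(\d+) fail-count=(\d+)")


def combined_failures(summary):
    """Return {(node, resource): (summed fail-count, threshold)}.

    The ":N" instance suffix is dropped, so all instances of a
    non-unique clone on one node are counted together (hypothetical
    helper for illustration only).
    """
    totals = defaultdict(int)
    thresholds = {}
    node = None
    for raw in summary.splitlines():
        line = raw.strip()
        node_match = NODE_RE.match(line)
        if node_match:
            node = node_match.group(1)
            continue
        fail_match = FAIL_RE.match(line)
        if fail_match and node is not None:
            key = (node, fail_match.group(1))
            totals[key] += int(fail_match.group(3))
            thresholds[key] = int(fail_match.group(2))
    return {key: (count, thresholds[key]) for key, count in totals.items()}


for (node, rsc), (fails, limit) in sorted(combined_failures(SUMMARY).items()):
    state = "at or over threshold" if fails >= limit else "under threshold"
    print(f"{node} {rsc}: combined fail-count={fails}, "
          f"migration-threshold={limit} ({state})")
```

Counted this way, srv01 reaches a combined fail-count of 6 against migration-threshold=5 even though each instance individually shows only 3, which is the mismatch with srv04 described in this thread.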
_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker