----- Original Message -----
> From: renayama19661...@ybb.ne.jp
> To: "PaceMaker-ML" <pacemaker@oss.clusterlabs.org>
> Sent: Monday, February 17, 2014 7:06:53 PM
> Subject: [Pacemaker] [Problem] Fail-over is delayed.(State transition is not calculated.)
>
> Hi All,
>
> I checked how Pacemaker 1.1.11 behaves when a failure occurs in a
> Master/Slave configuration.
>
> -------------------------------------
>
> Step1) Set up the cluster.
>
> [root@srv01 ~]# crm_mon -1 -Af
> Last updated: Tue Feb 18 18:07:24 2014
> Last change: Tue Feb 18 18:05:46 2014 via crmd on srv01
> Stack: corosync
> Current DC: srv01 (3232238180) - partition with quorum
> Version: 1.1.10-9d39a6b
> 2 Nodes configured
> 6 Resources configured
>
>
> Online: [ srv01 srv02 ]
>
>  vip-master (ocf::heartbeat:Dummy): Started srv01
>  vip-rep (ocf::heartbeat:Dummy): Started srv01
>  Master/Slave Set: msPostgresql [pgsql]
>      Masters: [ srv01 ]
>      Slaves: [ srv02 ]
>  Clone Set: clnPingd [prmPingd]
>      Started: [ srv01 srv02 ]
>
> Node Attributes:
> * Node srv01:
>     + default_ping_set : 100
>     + master-pgsql : 10
> * Node srv02:
>     + default_ping_set : 100
>     + master-pgsql : 5
>
> Migration summary:
> * Node srv01:
> * Node srv02:
>
> Step2) Cause a monitor error in vip-master.
>
> [root@srv01 ~]# rm -rf /var/run/resource-agents/Dummy-vip-master.state
>
> [root@srv01 ~]# crm_mon -1 -Af
> Last updated: Tue Feb 18 18:07:58 2014
> Last change: Tue Feb 18 18:05:46 2014 via crmd on srv01
> Stack: corosync
> Current DC: srv01 (3232238180) - partition with quorum
> Version: 1.1.10-9d39a6b
> 2 Nodes configured
> 6 Resources configured
>
>
> Online: [ srv01 srv02 ]
>
>  Master/Slave Set: msPostgresql [pgsql]
>      Masters: [ srv01 ]
>      Slaves: [ srv02 ]
>  Clone Set: clnPingd [prmPingd]
>      Started: [ srv01 srv02 ]
>
> Node Attributes:
> * Node srv01:
>     + default_ping_set : 100
>     + master-pgsql : 10
> * Node srv02:
>     + default_ping_set : 100
>     + master-pgsql : 5
>
> Migration summary:
> * Node srv01:
>    vip-master: migration-threshold=1 fail-count=1 last-failure='Tue Feb 18 18:07:50 2014'
> * Node srv02:
>
> Failed actions:
>     vip-master_monitor_10000 on srv01 'not running' (7): call=30, status=complete, last-rc-change='Tue Feb 18 18:07:50 2014', queued=0ms, exec=0ms
> -------------------------------------
>
> However, the resource does not fail over.
>
> Yet crm_simulate does calculate a fail-over when I run it against the CIB
> at this point in time.
>
> -------------------------------------
> [root@srv01 ~]# crm_simulate -L -s
>
> Current cluster status:
> Online: [ srv01 srv02 ]
>
>  vip-master (ocf::heartbeat:Dummy): Stopped
>  vip-rep (ocf::heartbeat:Dummy): Stopped
>  Master/Slave Set: msPostgresql [pgsql]
>      Masters: [ srv01 ]
>      Slaves: [ srv02 ]
>  Clone Set: clnPingd [prmPingd]
>      Started: [ srv01 srv02 ]
>
> Allocation scores:
> clone_color: clnPingd allocation score on srv01: 0
> clone_color: clnPingd allocation score on srv02: 0
> clone_color: prmPingd:0 allocation score on srv01: INFINITY
> clone_color: prmPingd:0 allocation score on srv02: 0
> clone_color: prmPingd:1 allocation score on srv01: 0
> clone_color: prmPingd:1 allocation score on srv02: INFINITY
> native_color: prmPingd:0 allocation score on srv01: INFINITY
> native_color: prmPingd:0 allocation score on srv02: 0
> native_color: prmPingd:1 allocation score on srv01: -INFINITY
> native_color: prmPingd:1 allocation score on srv02: INFINITY
> clone_color: msPostgresql allocation score on srv01: 0
> clone_color: msPostgresql allocation score on srv02: 0
> clone_color: pgsql:0 allocation score on srv01: INFINITY
> clone_color: pgsql:0 allocation score on srv02: 0
> clone_color: pgsql:1 allocation score on srv01: 0
> clone_color: pgsql:1 allocation score on srv02: INFINITY
> native_color: pgsql:0 allocation score on srv01: INFINITY
> native_color: pgsql:0 allocation score on srv02: 0
> native_color: pgsql:1 allocation score on srv01: -INFINITY
> native_color: pgsql:1 allocation score on srv02: INFINITY
> pgsql:1 promotion score on srv02: 5
> pgsql:0 promotion score on srv01: 1
> native_color: vip-master allocation score on srv01: -INFINITY
> native_color: vip-master allocation score on srv02: INFINITY
> native_color: vip-rep allocation score on srv01: -INFINITY
> native_color: vip-rep allocation score on srv02: INFINITY
>
> Transition Summary:
>  * Start vip-master (srv02)
>  * Start vip-rep (srv02)
>  * Demote pgsql:0 (Master -> Slave srv01)
>  * Promote pgsql:1 (Slave -> Master srv02)
>
> -------------------------------------
>
> In addition, fail-over is also calculated when "cluster_recheck_interval"
> fires.
>
> Fail-over is also carried out when I run cibadmin -B.
>
> -------------------------------------
> [root@srv01 ~]# cibadmin -B
>
> [root@srv01 ~]# crm_mon -1 -Af
> Last updated: Tue Feb 18 18:21:15 2014
> Last change: Tue Feb 18 18:21:00 2014 via cibadmin on srv01
> Stack: corosync
> Current DC: srv01 (3232238180) - partition with quorum
> Version: 1.1.10-9d39a6b
> 2 Nodes configured
> 6 Resources configured
>
>
> Online: [ srv01 srv02 ]
>
>  vip-master (ocf::heartbeat:Dummy): Started srv02
>  vip-rep (ocf::heartbeat:Dummy): Started srv02
>  Master/Slave Set: msPostgresql [pgsql]
>      Masters: [ srv02 ]
>      Slaves: [ srv01 ]
>  Clone Set: clnPingd [prmPingd]
>      Started: [ srv01 srv02 ]
>
> Node Attributes:
> * Node srv01:
>     + default_ping_set : 100
>     + master-pgsql : 5
> * Node srv02:
>     + default_ping_set : 100
>     + master-pgsql : 10
>
> Migration summary:
> * Node srv01:
>    vip-master: migration-threshold=1 fail-count=1 last-failure='Tue Feb 18 18:07:50 2014'

You have resource-stickiness=INFINITY; this is what is preventing the failover
from occurring. Set resource-stickiness=1 or 0 and the failover should occur.

-- Vossel

> * Node srv02:
>
> Failed actions:
>     vip-master_monitor_10000 on srv01 'not running' (7): call=30, status=complete, last-rc-change='Tue Feb 18 18:07:50 2014', queued=0ms, exec=0ms
>
> -------------------------------------
>
> The delay in carrying out the fail-over is a problem.
> I think that Pacemaker is the cause of the delay between the error and the
> fail-over.
>
> I registered these details and the log information in Bugzilla.
>  * http://bugs.clusterlabs.org/show_bug.cgi?id=5197
>
> Best Regards,
> Hideo Yamauchi.
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
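
For reference, here is a minimal sketch of the change suggested in the reply above (lowering resource-stickiness from INFINITY). It assumes the stickiness is defined in the cluster-wide rsc_defaults section; the original post does not show where it is set, so if it is set per resource it would have to be changed there instead. The function name set_default_stickiness and the DRY_RUN variable are made up for illustration; crm_attribute is Pacemaker's own command-line tool. By default the script only prints the command so it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch only: lower the default resource-stickiness as suggested in the reply.
# Assumption: resource-stickiness lives in rsc_defaults (not shown in the post).
# set_default_stickiness and DRY_RUN are hypothetical helpers for illustration.

set_default_stickiness() {
    value="$1"
    cmd="crm_attribute --type rsc_defaults --name resource-stickiness --update $value"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (default): show what would be executed, change nothing.
        echo "$cmd"
    else
        # Real run: must be executed on a cluster node with Pacemaker installed.
        $cmd
    fi
}

# The reply suggests 1 or 0; 1 keeps a slight preference for the current node.
set_default_stickiness 1
```

Running it with DRY_RUN=0 on a cluster node would apply the change; whether the fail-over then happens without delay would still need to be confirmed with crm_mon -1 -Af after repeating Step2.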