On 13 Jun 2014, at 5:35 pm, kamal kishi <kamal.ki...@gmail.com> wrote:
> Hi Andrew,
> Checked the logs, and I suspect OCFS2 is taking time to recover. Can
> anyone please verify my log and confirm whether I'm correct?

I see:

Jun 13 17:48:57 server2 crmd: [1840]: info: do_lrm_rsc_op: Performing key=6:111:0:68dbc62f-5255-4a32-915b-fbb3954c6092 op=resXen1_start_0 )
...
Jun 13 17:50:14 server2 lrmd: [1837]: info: RA output: (resXen1:start:stdout) Using config file "/home/cluster/xen/win7.cfg".#012Started domain xenwin7 (id=5)
Jun 13 17:50:16 server2 lrmd: [1837]: info: operation start[115] on resXen1 for client 1840: pid 20673 exited with return code 0

That looks like Xen taking about 70s to start, which is almost the entire period covered by the logs.

> And if OCFS2 is the reason for the delay in failover, may I know a way
> to reduce that delay?
>
> Attached are my syslog and pacemaker configuration.
>
> Looking forward to a solution.
>
> On Fri, Jun 13, 2014 at 8:55 AM, kamal kishi <kamal.ki...@gmail.com> wrote:
> Fine Andrew, will check it out, but do the timeouts configured in
> Pacemaker affect this? Which part of the timing configuration does
> Pacemaker use to decide that the other node is actually down and that
> its resources should be taken over?
>
> And Alexis, I'm not facing any issue putting a node into standby mode.
> I'm using DRBD 8.3.11 (apt-get install drbd8-utils=2:8.3.11-0ubuntu1).
> I had to pin the install to that particular version because the current
> package is not compatible with Pacemaker.
> You could try installing 8.3.11 as well and check. All the best.
>
> On Fri, Jun 13, 2014 at 5:22 AM, Andrew Beekhof <and...@beekhof.net> wrote:
>
> > On 12 Jun 2014, at 9:15 pm, kamal kishi <kamal.ki...@gmail.com> wrote:
> >
> > > Hi All,
> > >
> > > This might be a basic question, but I'm not sure what is taking so
> > > long during failover switching. I hope someone can figure it out.
> >
> > How about looking in the logs and seeing when the various stop/start
> > actions occur and which ones take the longest?
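To put numbers on each step, one can pair the crmd "Performing ... op=" line with the matching lrmd completion line and print per-action durations. A rough sketch (assumes the syslog format shown above and same-day timestamps; the op_times name is just for illustration):

```shell
# op_times: read syslog lines on stdin and print how long each resource
# action took, by pairing "do_lrm_rsc_op: Performing ... op=<rsc>_<act>_0"
# with the matching "operation <act>[..] on <rsc> ... exited" line.
# Assumes timestamps like "Jun 13 17:48:57" in field 3, same-day logs only.
op_times() {
    awk '
    /do_lrm_rsc_op: Performing/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^op=/) op = substr($i, 4)      # e.g. resXen1_start_0
        split($3, t, ":")
        start[op] = t[1]*3600 + t[2]*60 + t[3]       # dispatch time in seconds
    }
    /operation .* exited with return code/ {
        act = $0; sub(/.*operation /, "", act); sub(/\[.*/, "", act)
        rsc = $0; sub(/.* on /, "", rsc); sub(/ for client.*/, "", rsc)
        key = rsc "_" act "_0"
        if (key in start) {
            split($3, t, ":")
            printf "%s took %ds\n", key, t[1]*3600 + t[2]*60 + t[3] - start[key]
        }
    }'
}
```

Feeding the log excerpt above through this reports the resXen1 start at 79 seconds from dispatch (17:48:57) to completion (17:50:16), which supports the reading that the Xen start itself, not OCFS2, dominates the window.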
> > > Scenario:
> > > Pacemaker running DRBD (dual-primary mode) + OCFS2 + Xen for a
> > > virtual Windows machine.
> > >
> > > Pacemaker startup order: DRBD -> OCFS2 -> Xen.
> > > Say that on Server1, DRBD, OCFS2 (clone), and Xen are started, and
> > > on Server2, DRBD and OCFS2 (clone) are started.
> > >
> > > Now, if Server1 loses power, the Xen resource that was running on
> > > Server1 should fail over to Server2. In my case, this takes almost
> > > 90 to 110 seconds. Can anyone suggest ways to reduce it to within
> > > 30 to 40 seconds?
> > >
> > > My pacemaker configuration is:
> > >
> > > crm configure
> > > property no-quorum-policy=ignore
> > > property stonith-enabled=false
> > > property default-resource-stickiness=1000
> > >
> > > primitive resDRBDr1 ocf:linbit:drbd \
> > >   params drbd_resource="r0" \
> > >   op start interval="0" timeout="240s" \
> > >   op stop interval="0" timeout="100s" \
> > >   op monitor interval="20s" role="Master" timeout="240s" \
> > >   op monitor interval="30s" role="Slave" timeout="240s" \
> > >   meta migration-threshold="3" failure-timeout="60s"
> > > primitive resOCFS2r1 ocf:heartbeat:Filesystem \
> > >   params device="/dev/drbd/by-res/r0" directory="/cluster" fstype="ocfs2" \
> > >   op monitor interval="10s" timeout="60s" \
> > >   op start interval="0" timeout="90s" \
> > >   op stop interval="0" timeout="60s" \
> > >   meta migration-threshold="3" failure-timeout="60s"
> > > primitive resXen1 ocf:heartbeat:Xen \
> > >   params xmfile="/home/cluster/xen/win7.cfg" name="xenwin7" \
> > >   op monitor interval="20s" timeout="60s" \
> > >   op start interval="0" timeout="90s" \
> > >   op stop interval="0" timeout="60s" \
> > >   op migrate_from interval="0" timeout="120s" \
> > >   op migrate_to interval="0" timeout="120s" \
> > >   meta allow-migrate="true" target-role="started"
> > >
> > > ms msDRBDr1 resDRBDr1 \
> > >   meta notify="true" master-max="2" interleave="true" target-role="Started"
> > > clone cloOCFS2r1 resOCFS2r1 \
> > >   meta interleave="true" ordered="true" target-role="Started"
> > > colocation colOCFS12-with-DRBDrMaster inf: cloOCFS2r1 msDRBDr1:Master
> > > colocation colXen-with-OCFSr1 inf: resXen1 cloOCFS2r1
> > > order ordDRBD-before-OCFSr1 inf: msDRBDr1:promote cloOCFS2r1:start
> > > order ordOCFS2r1-before-Xen1 inf: cloOCFS2r1:start resXen1:start
> > >
> > > commit
> > > bye
> > >
> > > --
> > > Regards,
> > > Kamal Kishore B V
> > >
> > > _______________________________________________
> > > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > >
> > > Project Home: http://www.clusterlabs.org
> > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > Bugs: http://bugs.clusterlabs.org
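Since the logs show the Xen start itself consuming most of the failover window, the other lever is how quickly the cluster declares Server1 dead: that detection delay is governed by the membership layer, not by the Pacemaker operation timeouts above. A hedged sketch, assuming a corosync-based stack (values purely illustrative, not a recommendation for this cluster; on a heartbeat stack the equivalent knob is deadtime in ha.cf):

```
# /etc/corosync/corosync.conf (fragment) -- illustrative values only
totem {
    version: 2
    # Milliseconds to wait for the token before declaring a node lost.
    # Lower values detect node failure sooner, but risk false positives
    # on a congested network.
    token: 3000
    token_retransmits_before_loss_const: 10
}
```

Note that shrinking failure-detection time is only safe alongside working fencing; with stonith-enabled=false and dual-primary DRBD, a false positive risks both nodes writing to the OCFS2 volume at once.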