Not sure whether I'm doing this the right way, but here goes.
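For reference, and as far as I understand the crm_simulate options: -L takes the live CIB as input, -s shows the allocation scores, and -d / -u simulate taking the named node down or bringing it back up. So the two invocations used below are:

# crm_simulate -L -s -d pgdbsrv01.cl1.local    # what would happen if node #1 went offline
# crm_simulate -L -s -u pgdbsrv01.cl1.local    # what would happen if node #1 came back online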
With resources started on node #1:

# crm_simulate -L -s -d pgdbsrv01.cl1.local

Current cluster status:
Online: [ pgdbsrv01.cl1.local pgdbsrv02.cl1.local ]

 Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
     Masters: [ pgdbsrv01.cl1.local ]
     Slaves: [ pgdbsrv02.cl1.local ]
 Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
     Masters: [ pgdbsrv01.cl1.local ]
     Slaves: [ pgdbsrv02.cl1.local ]
 Resource Group: GRP_data01
     LVM_vgdata01 (ocf::heartbeat:LVM): Started pgdbsrv01.cl1.local
     FS_data01 (ocf::heartbeat:Filesystem): Started pgdbsrv01.cl1.local
 Resource Group: GRP_data02
     LVM_vgdata02 (ocf::heartbeat:LVM): Started pgdbsrv01.cl1.local
     FS_data02 (ocf::heartbeat:Filesystem): Started pgdbsrv01.cl1.local
 fusion-fencing (stonith:fence_fusion): Started pgdbsrv02.cl1.local

Performing requested modifications
 + Taking node pgdbsrv01.cl1.local offline

Allocation scores:
clone_color: DRBD_ms_data01 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_ms_data01 allocation score on pgdbsrv02.cl1.local: 0
clone_color: DRBD_data01:0 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data01:0 allocation score on pgdbsrv02.cl1.local: 10000
clone_color: DRBD_data01:1 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data01:1 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data01:0 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data01:0 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data01:1 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data01:1 allocation score on pgdbsrv02.cl1.local: -INFINITY
DRBD_data01:0 promotion score on pgdbsrv02.cl1.local: 10000
DRBD_data01:1 promotion score on none: 0
clone_color: DRBD_ms_data02 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_ms_data02 allocation score on pgdbsrv02.cl1.local: 0
clone_color: DRBD_data02:0 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data02:0 allocation score on pgdbsrv02.cl1.local: 10000
clone_color: DRBD_data02:1 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data02:1 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data02:0 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data02:0 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data02:1 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data02:1 allocation score on pgdbsrv02.cl1.local: -INFINITY
DRBD_data02:0 promotion score on pgdbsrv02.cl1.local: 10000
DRBD_data02:1 promotion score on none: 0
group_color: GRP_data01 allocation score on pgdbsrv01.cl1.local: 0
group_color: GRP_data01 allocation score on pgdbsrv02.cl1.local: 0
group_color: LVM_vgdata01 allocation score on pgdbsrv01.cl1.local: 0
group_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 0
group_color: FS_data01 allocation score on pgdbsrv01.cl1.local: 0
group_color: FS_data01 allocation score on pgdbsrv02.cl1.local: 0
native_color: LVM_vgdata01 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 10000
native_color: FS_data01 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: FS_data01 allocation score on pgdbsrv02.cl1.local: 0
group_color: GRP_data02 allocation score on pgdbsrv01.cl1.local: 0
group_color: GRP_data02 allocation score on pgdbsrv02.cl1.local: 0
group_color: LVM_vgdata02 allocation score on pgdbsrv01.cl1.local: 0
group_color: LVM_vgdata02 allocation score on pgdbsrv02.cl1.local: 0
group_color: FS_data02 allocation score on pgdbsrv01.cl1.local: 0
group_color: FS_data02 allocation score on pgdbsrv02.cl1.local: 0
native_color: LVM_vgdata02 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: LVM_vgdata02 allocation score on pgdbsrv02.cl1.local: 10000
native_color: FS_data02 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: FS_data02 allocation score on pgdbsrv02.cl1.local: 0
native_color: fusion-fencing allocation score on pgdbsrv01.cl1.local: 0
native_color: fusion-fencing allocation score on pgdbsrv02.cl1.local: 0

Transition Summary:
 * Promote DRBD_data01:0 (Slave -> Master pgdbsrv02.cl1.local)
 * Demote  DRBD_data01:1 (Master -> Stopped pgdbsrv01.cl1.local)
 * Promote DRBD_data02:0 (Slave -> Master pgdbsrv02.cl1.local)
 * Demote  DRBD_data02:1 (Master -> Stopped pgdbsrv01.cl1.local)
 * Move    LVM_vgdata01 (Started pgdbsrv01.cl1.local -> pgdbsrv02.cl1.local)
 * Move    FS_data01 (Started pgdbsrv01.cl1.local -> pgdbsrv02.cl1.local)
 * Move    LVM_vgdata02 (Started pgdbsrv01.cl1.local -> pgdbsrv02.cl1.local)
 * Move    FS_data02 (Started pgdbsrv01.cl1.local -> pgdbsrv02.cl1.local)

..taking node #1 offline (standby) for real, resources running on node #2, then:

# crm_simulate -L -s -u pgdbsrv01.cl1.local

Current cluster status:
Node pgdbsrv01.cl1.local: standby
Online: [ pgdbsrv02.cl1.local ]

 Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
     Masters: [ pgdbsrv02.cl1.local ]
     Stopped: [ DRBD_data01:1 ]
 Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
     Masters: [ pgdbsrv02.cl1.local ]
     Stopped: [ DRBD_data02:1 ]
 Resource Group: GRP_data01
     LVM_vgdata01 (ocf::heartbeat:LVM): Started pgdbsrv02.cl1.local
     FS_data01 (ocf::heartbeat:Filesystem): Started pgdbsrv02.cl1.local
 Resource Group: GRP_data02
     LVM_vgdata02 (ocf::heartbeat:LVM): Started pgdbsrv02.cl1.local
     FS_data02 (ocf::heartbeat:Filesystem): Started pgdbsrv02.cl1.local
 fusion-fencing (stonith:fence_fusion): Started pgdbsrv02.cl1.local

Performing requested modifications
 + Bringing node pgdbsrv01.cl1.local online

Allocation scores:
clone_color: DRBD_ms_data01 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_ms_data01 allocation score on pgdbsrv02.cl1.local: 0
clone_color: DRBD_data01:0 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data01:0 allocation score on pgdbsrv02.cl1.local: 10000
clone_color: DRBD_data01:1 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data01:1 allocation score on pgdbsrv02.cl1.local: 0
native_color: DRBD_data01:0 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data01:0 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data01:1 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data01:1 allocation score on pgdbsrv02.cl1.local: -INFINITY
DRBD_data01:0 promotion score on pgdbsrv02.cl1.local: 10000
DRBD_data01:1 promotion score on none: 0
clone_color: DRBD_ms_data02 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_ms_data02 allocation score on pgdbsrv02.cl1.local: 0
clone_color: DRBD_data02:0 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data02:0 allocation score on pgdbsrv02.cl1.local: 10000
clone_color: DRBD_data02:1 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data02:1 allocation score on pgdbsrv02.cl1.local: 0
native_color: DRBD_data02:0 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data02:0 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data02:1 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data02:1 allocation score on pgdbsrv02.cl1.local: -INFINITY
DRBD_data02:0 promotion score on pgdbsrv02.cl1.local: 10000
DRBD_data02:1 promotion score on none: 0
group_color: GRP_data01 allocation score on pgdbsrv01.cl1.local: 0
group_color: GRP_data01 allocation score on pgdbsrv02.cl1.local: 0
group_color: LVM_vgdata01 allocation score on pgdbsrv01.cl1.local: 0
group_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 0
group_color: FS_data01 allocation score on pgdbsrv01.cl1.local: 0
group_color: FS_data01 allocation score on pgdbsrv02.cl1.local: 0
native_color: LVM_vgdata01 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 10000
native_color: FS_data01 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: FS_data01 allocation score on pgdbsrv02.cl1.local: 0
group_color: GRP_data02 allocation score on pgdbsrv01.cl1.local: 0
group_color: GRP_data02 allocation score on pgdbsrv02.cl1.local: 0
group_color: LVM_vgdata02 allocation score on pgdbsrv01.cl1.local: 0
group_color: LVM_vgdata02 allocation score on pgdbsrv02.cl1.local: 0
group_color: FS_data02 allocation score on pgdbsrv01.cl1.local: 0
group_color: FS_data02 allocation score on pgdbsrv02.cl1.local: 0
native_color: LVM_vgdata02 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: LVM_vgdata02 allocation score on pgdbsrv02.cl1.local: 10000
native_color: FS_data02 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: FS_data02 allocation score on pgdbsrv02.cl1.local: 0
native_color: fusion-fencing allocation score on pgdbsrv01.cl1.local: 0
native_color: fusion-fencing allocation score on pgdbsrv02.cl1.local: 0

Transition Summary:

And that's it. Once I unstandby node #1 and rerun the same simulate:

# crm_simulate -L -s -u pgdbsrv01.cl1.local

Current cluster status:
Online: [ pgdbsrv01.cl1.local pgdbsrv02.cl1.local ]

 Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
     Masters: [ pgdbsrv02.cl1.local ]
     Slaves: [ pgdbsrv01.cl1.local ]
 Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
     Masters: [ pgdbsrv02.cl1.local ]
     Slaves: [ pgdbsrv01.cl1.local ]
 Resource Group: GRP_data01
     LVM_vgdata01 (ocf::heartbeat:LVM): Stopped
     FS_data01 (ocf::heartbeat:Filesystem): Stopped
 Resource Group: GRP_data02
     LVM_vgdata02 (ocf::heartbeat:LVM): Stopped
     FS_data02 (ocf::heartbeat:Filesystem): Stopped
 fusion-fencing (stonith:fence_fusion): Started pgdbsrv01.cl1.local

Performing requested modifications
 + Bringing node pgdbsrv01.cl1.local online

Allocation scores:
clone_color: DRBD_ms_data01 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_ms_data01 allocation score on pgdbsrv02.cl1.local: 0
clone_color: DRBD_data01:0 allocation score on pgdbsrv01.cl1.local: 10000
clone_color: DRBD_data01:0 allocation score on pgdbsrv02.cl1.local: 10000
clone_color: DRBD_data01:1 allocation score on pgdbsrv01.cl1.local: 10000
clone_color: DRBD_data01:1 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data01:1 allocation score on pgdbsrv01.cl1.local: 10000
native_color: DRBD_data01:1 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data01:0 allocation score on pgdbsrv01.cl1.local: 10000
native_color: DRBD_data01:0 allocation score on pgdbsrv02.cl1.local: -INFINITY
DRBD_data01:1 promotion score on pgdbsrv02.cl1.local: 10000
DRBD_data01:0 promotion score on pgdbsrv01.cl1.local: 10000
clone_color: DRBD_ms_data02 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_ms_data02 allocation score on pgdbsrv02.cl1.local: 0
clone_color: DRBD_data02:0 allocation score on pgdbsrv01.cl1.local: 10000
clone_color: DRBD_data02:0 allocation score on pgdbsrv02.cl1.local: 10000
clone_color: DRBD_data02:1 allocation score on pgdbsrv01.cl1.local: 10000
clone_color: DRBD_data02:1 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data02:1 allocation score on pgdbsrv01.cl1.local: 10000
native_color: DRBD_data02:1 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data02:0 allocation score on pgdbsrv01.cl1.local: 10000
native_color: DRBD_data02:0 allocation score on pgdbsrv02.cl1.local: -INFINITY
DRBD_data02:1 promotion score on pgdbsrv02.cl1.local: 10000
DRBD_data02:0 promotion score on pgdbsrv01.cl1.local: 10000
group_color: GRP_data01 allocation score on pgdbsrv01.cl1.local: 0
group_color: GRP_data01 allocation score on pgdbsrv02.cl1.local: 0
group_color: LVM_vgdata01 allocation score on pgdbsrv01.cl1.local: 0
group_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 0
group_color: FS_data01 allocation score on pgdbsrv01.cl1.local: 0
group_color: FS_data01 allocation score on pgdbsrv02.cl1.local: 0
native_color: LVM_vgdata01 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 10000
native_color: FS_data01 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: FS_data01 allocation score on pgdbsrv02.cl1.local: 0
group_color: GRP_data02 allocation score on pgdbsrv01.cl1.local: 0
group_color: GRP_data02 allocation score on pgdbsrv02.cl1.local: 0
group_color: LVM_vgdata02 allocation score on pgdbsrv01.cl1.local: 0
group_color: LVM_vgdata02 allocation score on pgdbsrv02.cl1.local: 0
group_color: FS_data02 allocation score on pgdbsrv01.cl1.local: 0
group_color: FS_data02 allocation score on pgdbsrv02.cl1.local: 0
native_color: LVM_vgdata02 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: LVM_vgdata02 allocation score on pgdbsrv02.cl1.local: 10000
native_color: FS_data02 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: FS_data02 allocation score on pgdbsrv02.cl1.local: 0
native_color: fusion-fencing allocation score on pgdbsrv01.cl1.local: 0
native_color: fusion-fencing allocation score on pgdbsrv02.cl1.local: 0

Transition Summary:
 * Start   LVM_vgdata01 (pgdbsrv02.cl1.local - blocked)
 * Start   FS_data01 (pgdbsrv02.cl1.local - blocked)
 * Start   LVM_vgdata02 (pgdbsrv02.cl1.local - blocked)
 * Start   FS_data02 (pgdbsrv02.cl1.local - blocked)

--
Heikki M

On 9.9.2013, at 12.02, Andreas Mock <andreas.m...@web.de> wrote:

> Hi Heikki,
>
> It has to be crm_simulate -L -s. Sorry for the wrong command line parameters.
>
> Best regards
> Andreas
>
>
> -----Original Message-----
> From: Heikki Manninen [mailto:h...@iki.fi]
> Sent: Monday, 9 September 2013 10:46
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Resource ordering/colocating question (DRBD + LVM + FS)
>
> Hello Andreas, thanks for your input, much appreciated.
>
> On 5.9.2013, at 16.39, "Andreas Mock" <andreas.m...@web.de> wrote:
>
>> 1) The second output of crm_mon shows a resource IP_database which is not shown in the initial crm_mon output and also not in the config. => Reduce your problem/config to the minimum that reproduces it.
>
> True. I edited out the resource from the e-mail because it did not have anything to do with the problem as such (it works OK all the time). I just forgot to remove it from the second copy-paste as well. And yes, there is no IP resource in the configuration any more.
>
>> 2) Enable logging and find out which node is the DC.
>> There in the logs you will find a lot of information showing what is going on. Hint: open a terminal session with a running "tail -f" on the logfile and watch it while issuing commands. You'll get used to it.
>
> Seems that node #2 was the DC (also visible in the pcs status output). I have looked at the logs all the time, I'm just not yet too familiar with the contents of Pacemaker logging. Here's the thing that keeps repeating every time those LVM and FS resources stay in the stopped state:
>
> Sep  3 20:01:23 pgdbsrv02 pengine[1667]:   notice: LogActions: Start   LVM_vgdata01#011(pgdbsrv01.cl1.local - blocked)
> Sep  3 20:01:23 pgdbsrv02 pengine[1667]:   notice: LogActions: Start   FS_data01#011(pgdbsrv01.cl1.local - blocked)
> Sep  3 20:01:23 pgdbsrv02 pengine[1667]:   notice: LogActions: Start   LVM_vgdata02#011(pgdbsrv01.cl1.local - blocked)
> Sep  3 20:01:23 pgdbsrv02 pengine[1667]:   notice: LogActions: Start   FS_data02#011(pgdbsrv01.cl1.local - blocked)
>
> So what does "blocked" mean here? Is it that node #1 in this case is in need of fencing/stonithing and is therefore being blocked, or something else (I have a background in the RHCS/HACMP/LifeKeeper etc. world)? no-quorum-policy is set to ignore.
>
>> 3) The status of a drbd resource shown by crm_mon doesn't give you all the information about the drbd devices. Have a look at drbd-overview on both nodes (e.g. syncing status).
>
> True, DRBD is working fine on these occasions. Connected, synced etc.
>
>> 4) This setup CRIES for stonithing. Even in a test environment. When stonith happens (and this is what you see immediately) you know something went wrong. This is a good indicator for errors in agents or in the config. Believe me, as tedious as stonithing is, it is just as valuable for getting hints about a bad cluster state. On virtual machines stonithing is not as painful as on real servers.
>
> Very much true. I have implemented some custom fencing/stonithing agents before on physical and virtual cluster environments. The problem here is that I'm not aware of a reasonably simple way to implement stonith with VMware Fusion, which I'm bound to use for this test setup. I'll have to dig more into this, though. So fencing from the cman cluster.conf is chained to Pacemaker fencing, Pacemaker stonith is disabled, and no-quorum-policy is ignore.
>
>> 5) Is the drbd fencing script enabled? If yes, in certain circumstances -INF rules are inserted to deny promotion on the "wrong" nodes. You should grep for them: 'cibadmin -Q | grep <resname>'
>
> No, DRBD fencing is not enabled and split-brain recovery is done manually.
>
>> 6) crm_simulate -L -v gives you an output of the scores of the resources on each node. I really don't know how to read it exactly (is there documentation for that anywhere?), but it gives you a hint where to look when resources don't start. Especially the aggregation of stickiness values in groups is sometimes misleading.
>
> Could be that I have a different version maybe, because -v is an unknown option, and:
>
> # crm_simulate -L -V
>
> Current cluster status:
> Online: [ pgdbsrv01.cl1.local pgdbsrv02.cl1.local ]
>
>  Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
>      Masters: [ pgdbsrv01.cl1.local ]
>      Slaves: [ pgdbsrv02.cl1.local ]
>  Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
>      Masters: [ pgdbsrv01.cl1.local ]
>      Slaves: [ pgdbsrv02.cl1.local ]
>  Resource Group: GRP_data01
>      LVM_vgdata01 (ocf::heartbeat:LVM): Stopped
>      FS_data01 (ocf::heartbeat:Filesystem): Stopped
>  Resource Group: GRP_data02
>      LVM_vgdata02 (ocf::heartbeat:LVM): Stopped
>      FS_data02 (ocf::heartbeat:Filesystem): Stopped
>
> Only shows that much.
>
> Original problem description left quoted below.
>
>
> --
> Heikki M
>
>
>> -----Original Message-----
>> From: Heikki Manninen [mailto:h...@iki.fi]
>> Sent: Thursday, 5 September 2013 14:08
>> To: pacemaker@oss.clusterlabs.org
>> Subject: [Pacemaker] Resource ordering/colocating question (DRBD + LVM + FS)
>>
>> Hello,
>>
>> I'm having a bit of a problem understanding what's going on with my simple two-node demo cluster here. My resources come up correctly after restarting the whole cluster, but the LVM and Filesystem resources fail to start after a single node restart or standby/unstandby (after the node comes back online - why do they even stop/start after the second node comes back?).
>>
>> OS: CentOS 6.4 (cman stack)
>> Pacemaker: pacemaker-1.1.8-7.el6.x86_64
>> DRBD: drbd84-utils-8.4.3-1.el6.elrepo.x86_64
>>
>> Everything is configured using: pcs-0.9.26-10.el6_4.1.noarch
>>
>> Two DRBD resources configured and working: data01 & data02
>> Two nodes: pgdbsrv01.cl1.local & pgdbsrv02.cl1.local
>>
>> Configuration:
>>
>> node pgdbsrv01.cl1.local
>> node pgdbsrv02.cl1.local
>> primitive DRBD_data01 ocf:linbit:drbd \
>>         params drbd_resource="data01" \
>>         op monitor interval="30s"
>> primitive DRBD_data02 ocf:linbit:drbd \
>>         params drbd_resource="data02" \
>>         op monitor interval="30s"
>> primitive FS_data01 ocf:heartbeat:Filesystem \
>>         params device="/dev/mapper/vgdata01-lvdata01" directory="/data01" fstype="ext4" \
>>         op monitor interval="30s"
>> primitive FS_data02 ocf:heartbeat:Filesystem \
>>         params device="/dev/mapper/vgdata02-lvdata02" directory="/data02" fstype="ext4" \
>>         op monitor interval="30s"
>> primitive LVM_vgdata01 ocf:heartbeat:LVM \
>>         params volgrpname="vgdata01" exclusive="true" \
>>         op monitor interval="30s"
>> primitive LVM_vgdata02 ocf:heartbeat:LVM \
>>         params volgrpname="vgdata02" exclusive="true" \
>>         op monitor interval="30s"
>> group GRP_data01 LVM_vgdata01 FS_data01
>> group GRP_data02 LVM_vgdata02 FS_data02
>> ms DRBD_ms_data01 DRBD_data01 \
>>         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>> ms DRBD_ms_data02 DRBD_data02 \
>>         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>> colocation colocation-GRP_data01-DRBD_ms_data01-INFINITY inf: GRP_data01 DRBD_ms_data01:Master
>> colocation colocation-GRP_data02-DRBD_ms_data02-INFINITY inf: GRP_data02 DRBD_ms_data02:Master
>> order order-DRBD_data01-GRP_data01-mandatory : DRBD_data01:promote GRP_data01:start
>> order order-DRBD_data02-GRP_data02-mandatory : DRBD_data02:promote GRP_data02:start
>> property $id="cib-bootstrap-options" \
>>         dc-version="1.1.8-7.el6-394e906" \
>>         cluster-infrastructure="cman" \
>>         stonith-enabled="false" \
>>         no-quorum-policy="ignore" \
>>         migration-threshold="1"
>> rsc_defaults $id="rsc_defaults-options" \
>>         resource-stickiness="100"
>>
>>
>> 1) After starting the cluster, everything runs happily:
>>
>> Last updated: Tue Sep 3 00:11:13 2013
>> Last change: Tue Sep 3 00:05:15 2013 via cibadmin on pgdbsrv01.cl1.local
>> Stack: cman
>> Current DC: pgdbsrv02.cl1.local - partition with quorum
>> Version: 1.1.8-7.el6-394e906
>> 2 Nodes configured, unknown expected votes
>> 9 Resources configured.
>>
>> Online: [ pgdbsrv01.cl1.local pgdbsrv02.cl1.local ]
>>
>> Full list of resources:
>>
>>  Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
>>      Masters: [ pgdbsrv01.cl1.local ]
>>      Slaves: [ pgdbsrv02.cl1.local ]
>>  Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
>>      Masters: [ pgdbsrv01.cl1.local ]
>>      Slaves: [ pgdbsrv02.cl1.local ]
>>  Resource Group: GRP_data01
>>      LVM_vgdata01 (ocf::heartbeat:LVM): Started pgdbsrv01.cl1.local
>>      FS_data01 (ocf::heartbeat:Filesystem): Started pgdbsrv01.cl1.local
>>  Resource Group: GRP_data02
>>      LVM_vgdata02 (ocf::heartbeat:LVM): Started pgdbsrv01.cl1.local
>>      FS_data02 (ocf::heartbeat:Filesystem): Started pgdbsrv01.cl1.local
>>
>> 2) Putting node #1 to standby mode - after which everything runs happily on node pgdbsrv02.cl1.local
>>
>> # pcs cluster standby pgdbsrv01.cl1.local
>> # pcs status
>> Last updated: Tue Sep 3 00:16:01 2013
>> Last change: Tue Sep 3 00:15:55 2013 via crm_attribute on pgdbsrv02.cl1.local
>> Stack: cman
>> Current DC: pgdbsrv02.cl1.local - partition with quorum
>> Version: 1.1.8-7.el6-394e906
>> 2 Nodes configured, unknown expected votes
>> 9 Resources configured.
>>
>>
>> Node pgdbsrv01.cl1.local: standby
>> Online: [ pgdbsrv02.cl1.local ]
>>
>> Full list of resources:
>>
>>  IP_database (ocf::heartbeat:IPaddr2): Started pgdbsrv02.cl1.local
>>  Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
>>      Masters: [ pgdbsrv02.cl1.local ]
>>      Stopped: [ DRBD_data01:1 ]
>>  Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
>>      Masters: [ pgdbsrv02.cl1.local ]
>>      Stopped: [ DRBD_data02:1 ]
>>  Resource Group: GRP_data01
>>      LVM_vgdata01 (ocf::heartbeat:LVM): Started pgdbsrv02.cl1.local
>>      FS_data01 (ocf::heartbeat:Filesystem): Started pgdbsrv02.cl1.local
>>  Resource Group: GRP_data02
>>      LVM_vgdata02 (ocf::heartbeat:LVM): Started pgdbsrv02.cl1.local
>>      FS_data02 (ocf::heartbeat:Filesystem): Started pgdbsrv02.cl1.local
>>
>> 3) Putting node #1 back online - it seems that all the resources stop (?) and then DRBD gets promoted successfully on node #2 but LVM and FS resources never start
>>
>> # pcs cluster unstandby pgdbsrv01.cl1.local
>> # pcs status
>> Last updated: Tue Sep 3 00:17:00 2013
>> Last change: Tue Sep 3 00:16:56 2013 via crm_attribute on pgdbsrv02.cl1.local
>> Stack: cman
>> Current DC: pgdbsrv02.cl1.local - partition with quorum
>> Version: 1.1.8-7.el6-394e906
>> 2 Nodes configured, unknown expected votes
>> 9 Resources configured.
>>
>>
>> Online: [ pgdbsrv01.cl1.local pgdbsrv02.cl1.local ]
>>
>> Full list of resources:
>>
>>  IP_database (ocf::heartbeat:IPaddr2): Started pgdbsrv02.cl1.local
>>  Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
>>      Masters: [ pgdbsrv02.cl1.local ]
>>      Slaves: [ pgdbsrv01.cl1.local ]
>>  Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
>>      Masters: [ pgdbsrv02.cl1.local ]
>>      Slaves: [ pgdbsrv01.cl1.local ]
>>  Resource Group: GRP_data01
>>      LVM_vgdata01 (ocf::heartbeat:LVM): Stopped
>>      FS_data01 (ocf::heartbeat:Filesystem): Stopped
>>  Resource Group: GRP_data02
>>      LVM_vgdata02 (ocf::heartbeat:LVM): Stopped
>>      FS_data02 (ocf::heartbeat:Filesystem): Stopped
>>
>>
>> Any ideas why this is happening / what could be wrong in the resource configuration? The same thing happens when testing with the resources initially located the other way around. Also, if I stop & start one of the nodes, the same thing happens once the node gets back online.
>>
>>
>> --
>> Heikki Manninen <h...@iki.fi>

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org