Sorry, yes, those are typos - I'm having a bad day ;) Should have been:

order san1order inf: cs1vg1 san1vip ( cs1lb1grp cs1man1grp cs1master1grp cs1ddb1grp cs1dws1grp )
order san2order inf: cs1vg2 san2vip ( cs1lb2grp )

On 20 Nov 2012, at 19:16, Jake Smith wrote:

>
> ----- Original Message -----
>> From: "Craig Donnelly" <cr...@goaf.net>
>> To: pacemaker@oss.clusterlabs.org
>> Sent: Tuesday, November 20, 2012 1:56:03 PM
>> Subject: [Pacemaker] colocation conundrum
>>
>> Hi there,
>>
>> I think I've exhausted everything I can find online in terms of trying
>> to solve my problem so here goes with a posting to see if anyone on
>> this mailing list might be able to help please.
>>
>> I have a pacemaker 1.1.7/corosync 1.4.1 two node cluster running on
>> CentOS 6.3.
>> I'm using this cluster to support shared storage using a combination
>> of LVM and iSCSI.
>>
>> Now failover works fine if I offline/stonith a node. However when I
>> bring the node back online they enter a death-match situation.
>> I see the issue as being with ordering/colocation/resource sets and I
>> have tried a bunch of different variations and read and re-read all
>> the information I can find online without resolution.
>>
>> Would really appreciate any help/advice.
>>
>> The key entries that I can see in the logs are:
>>
>> NODE1:
>> ======
>> Nov 20 12:16:38 cs1san1 iSCSILogicalUnit(cs1lb1l1)[2710]: ERROR: tgtadm: invalid request
>> Nov 20 12:16:39 cs1san1 iSCSILogicalUnit(cs1man1l1)[2807]: ERROR: tgtadm: invalid request
>> Nov 20 12:22:55 cs1san1 iSCSILogicalUnit(cs1master1l1)[4482]: ERROR: tgtadm: invalid request
>> Nov 20 12:23:17 cs1san1 iSCSILogicalUnit(cs1ddb1l1)[4968]: ERROR: tgtadm: invalid request
>> Nov 20 12:23:18 cs1san1 iSCSILogicalUnit(cs1master1l1)[5081]: ERROR: tgtadm: invalid request
>> Nov 20 12:30:28 cs1san1 iSCSILogicalUnit(cs1lb1l1)[2670]: ERROR: tgtadm: invalid request
>>
>> NODE2:
>> ======
>> Nov 20 12:16:38 cs1san2 LVM(cs1vg1)[22039]: ERROR: Can't deactivate volume group "cs1vg1" with 3 open logical volume(s)
>> Nov 20 12:22:55 cs1san2 LVM(cs1vg1)[3386]: ERROR: Can't deactivate volume group "cs1vg1" with 2 open logical volume(s)
>> Nov 20 12:23:17 cs1san2 LVM(cs1vg1)[4296]: ERROR: Can't deactivate volume group "cs1vg1" with 1 open logical volume(s)
>> Nov 20 12:30:27 cs1san2 LVM(cs1vg1)[14943]: ERROR: Can't deactivate volume group "cs1vg1" with 4 open logical volume(s)
>>
>> which, to me, clearly indicates an ordering issue yet the
>> configuration I have follows the colocation/ordering rules in as
>> much as I can understand them.
>>
>> My "current" CRM config is as follows:
>> ==============================================================================
>> node cs1san1 \
>>     attributes standby="off"
>> node cs1san2 \
>>     attributes standby="off"
>> primitive alert ocf:heartbeat:MailTo \
>>     params email="o...@xyz.com" subject="CS takeover event" \
>>     op monitor interval="10s"
>> primitive cs1ddb1l1 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-10.com.xyz.cs1san1:cs1ddb1d1" lun="1" path="/dev/cs1vg1/cs1ddb1d1" \
>>     op monitor interval="10" timeout="15"
>> primitive cs1ddb1t1 ocf:heartbeat:iSCSITarget \
>>     params iqn="iqn.2012-10.com.xyz.cs1san1:cs1ddb1d1" tid="7" \
>>     op monitor interval="10" timeout="15"
>> primitive cs1dws1l1 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-10.com.xyz.cs1san1:cs1dws1d1" lun="1" path="/dev/cs1vg1/cs1dws1d1" \
>>     op monitor interval="10" timeout="15"
>> primitive cs1dws1t1 ocf:heartbeat:iSCSITarget \
>>     params iqn="iqn.2012-10.com.xyz.cs1san1:cs1dws1d1" tid="8" \
>>     op monitor interval="10" timeout="15"
>> primitive cs1lb1l1 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-10.com.xyz.cs1san1:cs1lb1d1" lun="1" path="/dev/cs1vg1/cs1lb1d1" \
>>     op start interval="0" timeout="15" \
>>     op stop interval="0" timeout="15" \
>>     op monitor interval="10" timeout="15" \
>>     meta is-managed="true"
>> primitive cs1lb1t1 ocf:heartbeat:iSCSITarget \
>>     params iqn="iqn.2012-10.com.xyz.cs1san1:cs1lb1d1" tid="1" \
>>     op monitor interval="10" timeout="15"
>> primitive cs1lb2l1 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-10.com.xyz.cs1san2:cs1lb2d1" lun="1" path="/dev/cs1vg2/cs1lb2d1" \
>>     op start interval="0" timeout="15" \
>>     op stop interval="0" timeout="15" \
>>     op monitor interval="10" timeout="15"
>> primitive cs1lb2t1 ocf:heartbeat:iSCSITarget \
>>     params iqn="iqn.2012-10.com.xyz.cs1san2:cs1lb2d1" tid="2" \
>>     op monitor interval="10" timeout="15"
>> primitive cs1man1l1 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-10.com.xyz.cs1san1:cs1man1d1" lun="1" path="/dev/cs1vg1/cs1man1d1" \
>>     op monitor interval="10" timeout="15"
>> primitive cs1man1t1 ocf:heartbeat:iSCSITarget \
>>     params iqn="iqn.2012-10.com.xyz.cs1san1:cs1man1d1" tid="5" \
>>     op monitor interval="10" timeout="15"
>> primitive cs1master1l1 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-10.com.xyz.cs1san1:cs1master1d1" lun="1" path="/dev/cs1vg1/cs1master1d1" \
>>     op monitor interval="10" timeout="15"
>> primitive cs1master1t1 ocf:heartbeat:iSCSITarget \
>>     params iqn="iqn.2012-10.com.xyz.cs1san1:cs1master1d1" tid="6" \
>>     op monitor interval="10" timeout="15"
>> primitive cs1vg1 ocf:heartbeat:LVM \
>>     params exclusive="true" volgrpname="cs1vg1" \
>>     op start interval="0" timeout="30s" \
>>     op stop interval="0" timeout="30s" \
>>     meta target-role="Started"
>> primitive cs1vg2 ocf:heartbeat:LVM \
>>     params exclusive="true" volgrpname="cs1vg2" \
>>     op start interval="0" timeout="30s" \
>>     op stop interval="0" timeout="30s" \
>>     meta target-role="Started"
>> primitive ping ocf:pacemaker:ping \
>>     params host_list="10.96.0.1 10.96.0.2" attempts="3" timeout="2s" multiplier="100" dampen="5s" \
>>     op monitor interval="10s"
>> primitive san1fencer stonith:fence_ipmilan \
>>     params pcmk_host_list="cs1san1" lanplus="1" ipaddr="10.96.0.21" login="admin" passwd="xxxxxxx" power_wait="4s" \
>>     op monitor interval="60s" \
>>     meta target-role="Started"
>> primitive san1vip ocf:heartbeat:IPaddr2 \
>>     params ip="10.94.0.101" cidr_netmask="24" \
>>     op monitor interval="10s" \
>>     meta target-role="Started"
>> primitive san2fencer stonith:fence_ipmilan \
>>     params pcmk_host_list="cs1san2" lanplus="1" ipaddr="10.96.0.22" login="admin" passwd="xxxxxxxx" power_wait="4s" \
>>     op monitor interval="60s" \
>>     meta target-role="Started"
>> primitive san2vip ocf:heartbeat:IPaddr2 \
>>     params ip="10.94.0.102" cidr_netmask="24" \
>>     op monitor interval="10s" \
>>     meta target-role="Started"
>> group cs1ddb1grp cs1ddb1t1 cs1ddb1l1 \
>>     meta target-role="Started"
>> group cs1dws1grp cs1dws1t1 cs1dws1l1 \
>>     meta target-role="Started"
>> group cs1lb1grp cs1lb1t1 cs1lb1l1 \
>>     meta target-role="Started"
>> group cs1lb2grp cs1lb2t1 cs1lb2l1 \
>>     meta target-role="Started"
>> group cs1man1grp cs1man1t1 cs1man1l1 \
>>     meta target-role="Started"
>> group cs1master1grp cs1master1t1 cs1master1l1 \
>>     meta target-role="Started"
>> clone alerts alert \
>>     meta target-role="Started"
>> clone pings ping \
>>     meta target-role="Started"
>> location san1fence san1fencer -inf: cs1san1
>> location san1loc cs1vg1 \
>>     rule $id="san1loc-rule1" 50: #uname eq cs1san1 \
>>     rule $id="san1loc-rule2" pingd: defined ping
>> location san2fence san2fencer -inf: cs1san2
>> location san2loc cs1vg2 \
>>     rule $id="san2loc-rule1" 50: #uname eq cs1san2 \
>>     rule $id="san2loc-rule2" pingd: defined ping
>> colocation san1colo inf: ( cs1lb1grp cs1man1grp cs1master1grp cs1ddb1grp cs1dws1grp ) san1vip cs1vg1
>> colocation san2colo inf: ( cs1lb2grp ) san2vip cs1vg2
>
> First my disclaimer - I don't use pacemaker for iSCSI so I'm not sure about
> *correct* ordering for iSCSI.
>
> But after a quick glance it looks like you are missing the ordering statements
> that coincide with your colocation statements.
> Something like this I would assume:
> order san1order inf: cs1vg1 san1vip ( cs1lb1grp cs1man1grp cs1master1grp cs1ddb1grp cs1dws1grp )
> order san2order inf: cs1vg2 san2vip ( cs1lb2grp )
>
> HTH
>
> Jake
>
>> property $id="cib-bootstrap-options" \
>>     dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
>>     cluster-infrastructure="openais" \
>>     expected-quorum-votes="2" \
>>     no-quorum-policy="ignore" \
>>     last-lrm-refresh="1353428951" \
>>     stonith-enabled="true" \
>>     maintenance-mode="false"
>> ===================================================================
>>
>> Regards
>> Craig

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org