On 2013.11.09. 1:00, Dennis Jacobfeuerborn wrote:
On 09.11.2013 00:15, Dennis Jacobfeuerborn wrote:
Hi,
I'm finally moving forward with creating a redundant gateway system for
a network, but I'm running into trouble. This is the configuration
I'm using:
node gw01 \
    attributes standby="off"
node gw02 \
    attributes standby="off"
primitive p_ip_gw_ext ocf:heartbeat:IPaddr2 \
    params ip="192.168.100.132" cidr_netmask="29" nic="eth0" \
    op monitor interval="10s"
primitive p_ip_gw_int ocf:heartbeat:IPaddr2 \
    params ip="192.168.214.4" cidr_netmask="24" nic="eth1" \
    op monitor interval="10s"
primitive p_route_ext ocf:heartbeat:Route \
    params destination="default" device="eth0" gateway="192.168.100.129" \
    op monitor interval="10" timeout="20" depth="0"
primitive p_route_int ocf:heartbeat:Route \
    params destination="default" device="eth1" gateway="192.168.214.1" \
    op monitor interval="10" timeout="20" depth="0"
group g_gateway p_ip_gw_ext p_ip_gw_int
colocation c_route_ext -inf: p_route_ext p_ip_gw_ext
colocation c_routes -inf: p_route_ext p_route_int
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-1.el6_4.4-368c726" \
    cluster-infrastructure="cman" \
    stonith-enabled="false" \
    no-quorum-policy="ignore" \
    last-lrm-refresh="1383949342"
The setup is fairly simple: one IP on the public interface, one IP on
the private interface. On the active system the default route goes
through the public interface; on the secondary system it goes through
the private interface.
The problem is that when I put the active node on standby the
p_route_int resource stays on the secondary system and the p_route_ext
resource gets stopped.
My interpretation: since no explicit priority is defined for either
route, and with only one node online only one of the two routes can be
placed, Pacemaker arbitrarily decides to keep p_route_int online and
take p_route_ext offline.
What I really want to express is that p_route_ext should always be
placed first, and p_route_int should only be placed if possible (e.g.
after a forced migration) but otherwise be taken offline (e.g. when a
node is down).
Any ideas on how to accomplish this?
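One option that might express this kind of preference is Pacemaker's
per-resource priority meta attribute: when not all resources can be
active, the cluster stops lower-priority resources in order to keep
higher-priority ones running. An untested sketch (the values 10 and 1
are arbitrary; only their relative order matters):

primitive p_route_ext ocf:heartbeat:Route \
    params destination="default" device="eth0" gateway="192.168.100.129" \
    op monitor interval="10" timeout="20" depth="0" \
    meta priority="10"
primitive p_route_int ocf:heartbeat:Route \
    params destination="default" device="eth1" gateway="192.168.214.1" \
    op monitor interval="10" timeout="20" depth="0" \
    meta priority="1"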
After sending the mail it occurred to me that "place one route not on
the same node as the other" is the wrong way to look at this. Instead I
did this:
colocation c_route_ext inf: p_route_ext p_ip_gw_ext
colocation c_route_int -inf: p_route_int p_ip_gw_ext
i.e. place the ext route with the ext IP, and the int route on a node
where the ext IP is *not* running. That way the routes no longer
depend on each other and their relative priority no longer matters.
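Put together, the constraint section of the configuration now reads as
follows (nodes, primitives and group unchanged; only the original
c_route_ext and c_routes lines are replaced):

colocation c_route_ext inf: p_route_ext p_ip_gw_ext
colocation c_route_int -inf: p_route_int p_ip_gw_ext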
Sorry for the noise.
Regards,
Dennis
Hi Dennis & list,
I am having a very similar problem with my cluster, but I am not sure if
your solution is truly good, so let me tell you about what I just
experienced:
I have two resource groups: one is responsible for a VPN tunnel, an
external IP, and an internal interface IP, for routing traffic from
other machines through the VPN tunnel; the other is just a single
routing resource that routes the other node's traffic through the VPN
tunnel.
If everything is fine then node1 has the VPN tunnel up and node2 has the
routing resource, meaning it sends all traffic through VPN on node1.
This is achieved by saying:
colocation routes -inf: vpn_resource_group route_resource_group
If both nodes are up and running then everything is fine, but today I
stopped node2 (which was running the route_resource_group) and to my
surprise (well, for the first 30 seconds, before I realized what had
probably happened) the VPN service group stopped too. According to my
theory this is what happened: Pacemaker saw that the
route_resource_group wasn't running anywhere, so it tried to start it
on node1, but the colocation rule told it not to, so the
route_resource_group went to 'failed' state; and since it is in a
colocation with the vpn_resource_group, the vpn_resource_group failed
too, bringing the whole cluster and all my services down.
So I am wondering how one could tell Pacemaker that if there is no way
to run resource2 (route_resource_group), then let it stop, but keep
resource1 (vpn_resource_group) running.
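For what it's worth, in the crm shell syntax "colocation <id> <score>:
<rsc> <with-rsc>" the *first* resource is the dependent one, i.e. its
placement is decided relative to the second. With the constraint above,
vpn_resource_group is therefore the dependent side, which would explain
why it was the group that went down. If that is indeed the cause,
swapping the operands should make the routing group the expendable side
(an untested sketch, reusing the resource names from above):

colocation routes -inf: route_resource_group vpn_resource_group

This reads as "keep route_resource_group off the node where
vpn_resource_group runs": when no other node is available,
route_resource_group simply stops, while vpn_resource_group keeps
running.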
Thank you a lot!
Domonkos
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org