On 2013.11.09 at 1:00, Dennis Jacobfeuerborn wrote:
On 09.11.2013 00:15, Dennis Jacobfeuerborn wrote:
Hi,
I'm finally moving forward with creating a redundant gateway system for
a network, but I'm running into trouble. This is the configuration that
I'm using:

node gw01 \
     attributes standby="off"
node gw02 \
     attributes standby="off"
primitive p_ip_gw_ext ocf:heartbeat:IPaddr2 \
     params ip="192.168.100.132" cidr_netmask="29" nic="eth0" \
     op monitor interval="10s"
primitive p_ip_gw_int ocf:heartbeat:IPaddr2 \
     params ip="192.168.214.4" cidr_netmask="24" nic="eth1" \
     op monitor interval="10s"
primitive p_route_ext ocf:heartbeat:Route \
     params destination="default" device="eth0" gateway="192.168.100.129" \
     op monitor interval="10" timeout="20" depth="0"
primitive p_route_int ocf:heartbeat:Route \
     params destination="default" device="eth1" gateway="192.168.214.1" \
     op monitor interval="10" timeout="20" depth="0"
group g_gateway p_ip_gw_ext p_ip_gw_int
colocation c_route_ext -inf: p_route_ext p_ip_gw_ext
colocation c_routes -inf: p_route_ext p_route_int
property $id="cib-bootstrap-options" \
     dc-version="1.1.10-1.el6_4.4-368c726" \
     cluster-infrastructure="cman" \
     stonith-enabled="false" \
     no-quorum-policy="ignore" \
     last-lrm-refresh="1383949342"

The setup is fairly simple. One IP on the public interface, one IP on
the private interface. On the active system the default route is
configured through the public interface, on the secondary system the
default route is configured through the private interface.

The problem is that when I put the active node on standby the
p_route_int resource stays on the secondary system and the p_route_ext
resource gets stopped.

My interpretation is that, since no explicit priority is defined for
either route and only one route can be placed while a single node is
online, Pacemaker arbitrarily decides to keep p_route_int online and
take p_route_ext offline.

What I really want to express is that p_route_ext should always be
placed first, and that p_route_int should be placed only if possible
(i.e. after a forced migration) and otherwise be taken offline (i.e.
when a node is down).
Any ideas on how to accomplish this?

After sending the mail it occurred to me that "place one route not on the same node as the other" is the wrong way to look at this. Instead I did this:

colocation c_route_ext inf: p_route_ext p_ip_gw_ext
colocation c_route_int -inf: p_route_int p_ip_gw_ext

I.e., place the ext route with the ext IP, and the int route on a node where the ext IP is *not* running. That way the routes no longer depend on each other, and the priority of the routes no longer matters.
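
For reference, the constraint section with this change applied might look like the sketch below. It reuses the resource names from the configuration above and assumes that the old c_routes constraint is deleted and that g_gateway stays as it was. Neither is spelled out in the description, so treat both as assumptions.

```
# Sketch only: assumes the old "c_routes" constraint has been removed
# and that g_gateway is unchanged from the original configuration.
group g_gateway p_ip_gw_ext p_ip_gw_int
# The ext route must run on the node that holds the ext IP.
colocation c_route_ext inf: p_route_ext p_ip_gw_ext
# The int route must never run on the node that holds the ext IP.
colocation c_route_int -inf: p_route_int p_ip_gw_ext
```

With only one node online, p_ip_gw_ext and p_route_ext can both run there, while p_route_int has no allowed node left and stays stopped.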

Sorry for the noise.

Regards,
  Dennis

Hi Dennis & list,

I am having a very similar problem with my cluster, but I am not sure if your solution is truly good, so let me tell you about what I just experienced:

I have two resource groups. One is responsible for a VPN, an external IP and an internal interface IP, and routes traffic from other machines through the VPN tunnel. The other is just a single routing resource which routes the other node's traffic through the VPN tunnel. If everything is fine, node1 has the VPN tunnel up and node2 has the routing resource, meaning node2 sends all its traffic through the VPN on node1. This is achieved by saying:

colocation routes -inf: vpn_resource_group route_resource_group

If both nodes are up and running then everything is fine. Today, however, I stopped node2 (which was running the route_resource_group), and to my surprise (at least for the first 30 seconds, before I realized what had probably happened) the VPN service group stopped too. My theory of what happened: Pacemaker saw that the route_resource_group wasn't running anywhere, so it tried to start it on node1, but the colocation rule forbade that, so the route_resource_group went into the 'failed' state. Since it is colocated with the vpn_resource_group, the vpn_resource_group failed too, bringing down the whole cluster and all my services.

So I am wondering how one could tell Pacemaker that, if there is no way to run resource2 (route_resource_group), it should stop that one but keep running resource1 (vpn_resource_group).
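
One possible approach, sketched here and not tested on this cluster: in crm shell syntax, the first resource named in a two-resource colocation is the dependent one, and Pacemaker decides where to put the second resource first. In the constraint above, that makes vpn_resource_group depend on where route_resource_group ends up. Swapping the two names should make the route group the one that yields:

```
# Sketch: same constraint id and group names as above, order swapped.
# route_resource_group is now the dependent resource: it is kept off the
# node running vpn_resource_group and stays stopped if no other node exists.
colocation routes -inf: route_resource_group vpn_resource_group
```

This follows the same pattern as the c_route_int constraint earlier in the thread, where the resource that may be stopped is listed first.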

Thank you a lot!
Domonkos

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
