[ https://issues.apache.org/jira/browse/CLOUDSTACK-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245523#comment-15245523 ]
dsclose commented on CLOUDSTACK-9339:
-------------------------------------

Thanks Wei - I'll take a look at that patch now.

As it stands I think I know why sorting the device list in CsAddress.py didn't work. CloudStack is generating a VR config file with multiple ip_associations.json sections - one for each public NIC. That is probably not an issue in itself, but the sections are not necessarily given in order of the device IDs. The first device given in the VR config file appears to correlate with the gateway assigned to the main routing table.

This means that CsAddress.py is just not the right place to sort the interfaces. If sorting is necessary then it would need to be done either by CS or by the update_config.py mechanism. Either way, this issue may have already been solved by your patch - I'll check and let you know.

> Virtual Routers don't handle Multiple Public Interfaces
> -------------------------------------------------------
>
>                 Key: CLOUDSTACK-9339
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9339
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.)
>          Components: Virtual Router
>    Affects Versions: 4.8.0
>            Reporter: dsclose
>              Labels: firewall, nat, router
>
> There are a series of issues with the way Virtual Routers manage multiple public interfaces. These are more pronounced on redundant virtual router setups. I have not attempted to examine these issues in a VPC context.
> Outside of a VPC context, however, the following is expected behaviour:
> * eth0 connects the router to the guest network.
> * In RvR setups, keepalived manages the guests' gateway IP as a virtual IP on eth0.
> * eth1 provides a local link to the hypervisor, allowing CloudStack to issue commands to the router.
> * eth2 is the router's public interface. By default, a single public IP will be set up on eth2 along with the necessary iptables and ip rules to source-NAT guest traffic to that public IP.
> * When a public IP address is assigned to the router that is on a separate subnet to the source-NAT IP, a new interface is configured, such as eth3, and the IP is assigned to that interface.
> * This can result in eth3, eth4, eth5, etc. being created depending upon how many public subnets the router has to work with.
> The above all works. The following, however, is currently not working:
> * Public interfaces should be set to DOWN on backup redundant routers. The master.py script is responsible for setting public interfaces to UP during a keepalived transition. Currently the check_is_up method of the CsIP class brings all interfaces UP on both redundant routers. A proposed fix for this has been discussed on the mailing list. That fix will leave public interfaces DOWN on RvR, allowing the keepalived transition to control the state of public interfaces. Issue #1413 includes a commit that contradicts the proposed fix, so it is unclear what the current state of the code should be.
> * Newly created interfaces should be set to UP on master redundant routers. Assuming public interfaces should by default be DOWN on an RvR, we need to accommodate the fact that, as interfaces are created, no keepalived transition occurs. This means that assigning an IP from a new public subnet will have no effect (as the interface will be down) until the network is restarted with a "clean up."
> * Public interfaces other than eth2 do not forward traffic. There are two iptables rules in the FORWARD chain of the filter table created for eth2 that allow forwarding between eth2 and eth0. Equivalent rules are not created for other public interfaces, so forwarded traffic is dropped.
> * Outbound traffic from guest VMs does not honour static-NAT rules. Instead, outbound traffic is source-NAT'd to the network's default source-NAT IP. New connections from guests that are destined for public networks are processed like so:
> 1. Traffic is matched against the following rule in the mangle table, which marks the connection with 0x0:
>     *mangle
>     -A PREROUTING -i eth0 -m state --state NEW -j CONNMARK --set-xmark 0x0/0xffffffff
> 2. There are no "ip rule" statements that match a connection marked 0x0, so the kernel routes the connection via the default gateway. That gateway is on the source-NAT subnet, so the connection is routed out of eth2.
> 3. The following iptables rules are then matched in the filter table:
>     *filter
>     -A FORWARD -i eth0 -o eth2 -j FW_OUTBOUND
>     -A FW_OUTBOUND -j FW_EGRESS_RULES
>     -A FW_EGRESS_RULES -j ACCEPT
> 4. Finally, the following rule is matched from the nat table, where the IP address is the source-NAT IP:
>     *nat
>     -A POSTROUTING -o eth2 -j SNAT --to-source 123.4.5.67

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
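
To illustrate the forwarding gap described in the issue (FORWARD rules exist for eth2 but not for eth3, eth4, etc.), here is a rough Python sketch that emits the quoted eth0-to-eth2 FORWARD rule once per public interface. It is only a sketch under assumptions: forward_rules_for and its arguments are hypothetical names, not anything in CsAddress.py, and interface names are assumed to have the form "ethN". Only the one FORWARD rule actually quoted in the issue is reproduced; the second eth2 rule is not shown there, so it is left out rather than guessed.

```python
# Hypothetical helper for illustration; not part of CloudStack's CsAddress.py.
GUEST_IFACE = "eth0"  # guest network interface, per the issue description


def forward_rules_for(public_ifaces, guest_iface=GUEST_IFACE):
    """Return one FORWARD rule per public interface.

    Each rule mirrors the eth2 rule quoted in the issue:
        -A FORWARD -i eth0 -o eth2 -j FW_OUTBOUND
    Assumes interface names look like "ethN" and orders them by N.
    """
    rules = []
    for dev in sorted(public_ifaces, key=lambda name: int(name[3:])):
        rules.append("-A FORWARD -i %s -o %s -j FW_OUTBOUND" % (guest_iface, dev))
    return rules


if __name__ == "__main__":
    # Example: a router with three public subnets.
    for rule in forward_rules_for(["eth3", "eth2", "eth4"]):
        print(rule)
```

Sorting by the numeric suffix keeps the output stable regardless of the order in which the interfaces are supplied, which seems relevant given that the VR config sections are not necessarily ordered by device ID.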