Thanks
----- Original Message -----
From: "Rohit Yadav" <rohit.ya...@shapeblue.com>
To: "dev" <dev@cloudstack.apache.org>
Sent: Friday, 20 April, 2018 10:35:55
Subject: Re: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT

Hi Andrei,

I've fixed this recently, please see
https://github.com/apache/cloudstack/pull/2579

As a workaround you can add routing rules manually. On the PR there is a link
to a comment that explains the issue and suggests a manual workaround. Let me
know if that works for you.

Regards.

rohit.ya...@shapeblue.com
www.shapeblue.com
53 Chandos Place, Covent Garden, London WC2N 4HS, UK
@shapeblue

From: Andrei Mikhailovsky
Sent: Friday, 20 April, 2:21 PM
Subject: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT
To: dev

Hello,

I have been posting to the users thread about this issue. Here is a quick
summary, in case the people contributing to the source NAT code on the VPC
side would like to fix it.

Problem summary: no connectivity between virtual machines behind two Static
NAT networks.

Problem case: when one virtual machine sends a packet to the external address
of another virtual machine, and both machines are behind Static NAT and
handled by the same router, the traffic does not work.

    10.1.10.100    10.1.10.1:eth2    eth3:10.1.20.1    10.1.20.100
       virt1  ------------------ router ------------------ virt2
                 178.248.108.77 : eth1 : 178.248.108.113

A single packet is sent from virt1 to virt2.

Stage 1: it arrives at the router on eth2 and enters nat_PREROUTING
(IN=eth2 OUT= SRC=10.1.10.100 DST=178.248.108.113), goes through the rule

    10 1K DNAT all -- * * 0.0.0.0/0 178.248.108.113 to:10.1.20.100

and has its DST DNATed to the internal IP of virt2.

Stage 2: it enters the FORWARD chain and is DROPPED by the default policy:

    DROPPED: IN=eth2 OUT=eth1 SRC=10.1.10.100 DST=10.1.20.100

The reason is that the OUT interface is not correctly changed from eth1 to
eth3 during nat_PREROUTING, so the packet is not matched by the FORWARD rule
and is therefore not accepted:

    24 14K ACL_INBOUND_eth3 all -- * eth3 0.0.0.0/0 10.1.20.0/24

Stage 3: with a manually inserted rule to accept this packet in FORWARD, the
packet enters the nat_POSTROUTING chain (IN= OUT=eth1 SRC=10.1.10.100
DST=10.1.20.100), has its SRC changed to the external IP by

    16 1320 SNAT all -- * eth1 10.1.10.100 0.0.0.0/0 to:178.248.108.77

and is sent to the external network on eth1:

    13:37:44.834341 IP 178.248.108.77 > 10.1.20.100: ICMP echo request, id 2644, seq 2, length 64

For some reason, during the nat_PREROUTING stage the DST IP is changed, but
the OUT interface still reflects the interface associated with the old DST
IP. Here is the routing table:

    # ip route list
    default via 178.248.108.1 dev eth1
    10.1.10.0/24 dev eth2 proto kernel scope link src 10.1.10.1
    10.1.20.0/24 dev eth3 proto kernel scope link src 10.1.20.1
    169.254.0.0/16 dev eth0 proto kernel scope link src 169.254.0.5
    178.248.108.0/25 dev eth1 proto kernel scope link src 178.248.108.101

    # ip rule list
    0:      from all lookup local
    32761:  from all fwmark 0x3 lookup Table_eth3
    32762:  from all fwmark 0x2 lookup Table_eth2
    32763:  from all fwmark 0x1 lookup Table_eth1
    32764:  from 10.1.0.0/16 lookup static_route_back
    32765:  from 10.1.0.0/16 lookup static_route
    32766:  from all lookup main
    32767:  from all lookup default
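A quick way to see how the fwmark diverts routing on the VR is to ask the
kernel which route it would choose with and without the mark (a sketch using
standard iproute2 commands; the addresses and table names are the ones from
the trace above):

    # Without a mark: the lookup falls through to the main table and
    # picks the connected route to the 10.1.20.0/24 tier on eth3.
    ip route get 10.1.20.100

    # With fwmark 0x1, as set by the mangle PREROUTING rules: the
    # "from all fwmark 0x1 lookup Table_eth1" rule wins, so the kernel
    # resolves the route via the default gateway on eth1 instead.
    ip route get 10.1.20.100 mark 0x1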
Further into the investigation, the problem was pinned down to the following
rules. All the traffic from the internal IPs of Static NAT connections was
forced out of the outside interface (eth1) by setting mark 0x1 and then using
the matching ip rule (32763, shown above) to direct it:

    # iptables -t mangle -L PREROUTING -vn
    Chain PREROUTING (policy ACCEPT 97 packets, 11395 bytes)
    pkts bytes target   prot opt in  out source       destination
      49  3644 CONNMARK all  --  *   *   10.1.10.100  0.0.0.0/0   state NEW CONNMARK save
      37  2720 MARK     all  --  *   *   10.1.20.100  0.0.0.0/0   state NEW MARK set 0x1
      37  2720 CONNMARK all  --  *   *   10.1.20.100  0.0.0.0/0   state NEW CONNMARK save
     114  8472 MARK     all  --  *   *   10.1.10.100  0.0.0.0/0   state NEW MARK set 0x1
     114  8472 CONNMARK all  --  *   *   10.1.10.100  0.0.0.0/0   state NEW CONNMARK save

The acceptable solution is to delete those rules altogether. The problem with
that approach is that traffic between tiers inside the VPC will then use the
internal IP addresses, so packets going from 178.248.108.77 to
178.248.108.113 would be seen as communication between 10.1.10.100 and
10.1.20.100. We therefore need to apply two further rules to make sure that
packets leaving the router have the correct source IP:

    # iptables -t nat -I POSTROUTING -o eth3 -s 10.1.10.0/24 -d 10.1.20.0/24 -j SNAT --to-source 178.248.108.77
    # iptables -t nat -I POSTROUTING -o eth2 -s 10.1.20.0/24 -d 10.1.10.0/24 -j SNAT --to-source 178.248.108.113

This way it is possible to have Static NAT on all of the IPs within the VPC
and ensure successful communication between them.

So, for a quick and dirty fix, we ran this command on the VR:

    for i in $(iptables -t mangle -L PREROUTING -vn | awk '/0x1/ && !/eth1/ {print $8}'); do
        iptables -t mangle -D PREROUTING -s $i -m state --state NEW -j MARK --set-mark 0x1
    done

The issue was introduced around the early 4.9.x releases, I believe.

Thanks

Andrei
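Putting the pieces above together, a sketch of the full manual workaround as
a single script (untested; the interface names, subnets and public IPs are
the example values from this thread and must be adapted, and the changes do
not survive a VR restart):

    #!/bin/sh
    # Sketch of the manual workaround described above, for the example
    # topology in this thread: virt1 (10.1.10.100, public 178.248.108.77,
    # tier on eth2) and virt2 (10.1.20.100, public 178.248.108.113, tier
    # on eth3).

    # 1. Delete the mangle rules that force Static NAT sources out of
    #    eth1 via mark 0x1, so intra-VPC traffic follows the main table.
    for src in $(iptables -t mangle -L PREROUTING -vn | awk '/0x1/ && !/eth1/ {print $8}'); do
        iptables -t mangle -D PREROUTING -s "$src" -m state --state NEW -j MARK --set-mark 0x1
    done

    # 2. SNAT the inter-tier traffic to the corresponding public IPs, so
    #    connections addressed to the Static NAT IPs look consistent.
    iptables -t nat -I POSTROUTING -o eth3 -s 10.1.10.0/24 -d 10.1.20.0/24 -j SNAT --to-source 178.248.108.77
    iptables -t nat -I POSTROUTING -o eth2 -s 10.1.20.0/24 -d 10.1.10.0/24 -j SNAT --to-source 178.248.108.113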
----- Original Message -----
From: "Andrei Mikhailovsky"
To: "users"
Sent: Monday, 16 April, 2018 22:32:25
Subject: Re: Upgrade from ACS 4.9.3 to 4.11.0

Hello,

I have done some more testing with the VPC network tiers, and it seems that
Static NAT is indeed causing connectivity issues. Here is what I've done:

Setup 1. I created two test network tiers with one guest VM in each tier.
Static NAT is NOT enabled. Each VM has a port forwarding rule (port 22) from
its dedicated public IP address. ACLs have been set up to allow traffic on
port 22 from the private IP addresses on each network tier.

1. ACLs seem to work just fine. Traffic between the networks flows according
   to the rules; both VMs can see each other's private IPs and can
   ping/ssh/etc.
2. From the Internet, hosts can access the VMs on port 22.
3. The VMs can also access each other and themselves on their public IPs. I
   don't think this worked before, but I could be wrong.

Setup 2. Everything the same as Setup 1, but one public IP address has been
set up as Static NAT to one guest VM. The second guest VM and second public
IP remained unchanged.

1. ACLs stopped working correctly (see below).
2. From the Internet, hosts can access the VMs on port 22, including the
   Static NAT VM.
3. Other guest VMs can access the Static NAT VM using both private and
   public IP addresses.
4. The Static NAT VM can NOT access other VMs, using either public or
   private IPs.
5. The Static NAT VM can access Internet hosts (apart from the public IP
   range belonging to the CloudStack setup).

The above behaviour in Setup 2 is very strange, especially points 4 & 5.

Any thoughts, anyone?

Cheers

----- Original Message -----
From: "Rohit Yadav"
To: "users"
Sent: Thursday, 12 April, 2018 12:06:54
Subject: Re: Upgrade from ACS 4.9.3 to 4.11.0

Hi Andrei,

Thanks for sharing. Yes, the egress thing is a known issue, caused by a
failure during VR setup to create the egress table. By performing a restart
of the network (without the cleanup option selected), the egress table gets
created and the rules are applied successfully.

The issue has been fixed in the VR downtime PR:

https://github.com/apache/cloudstack/pull/2508

- Rohit

________________________________
From: Andrei Mikhailovsky
Sent: Tuesday, April 3, 2018 3:33:43 PM
To: users
Subject: Re: Upgrade from ACS 4.9.3 to 4.11.0

Rohit,

Following the update from 4.9.3 to 4.11.0, I would like to comment on a few
things:

1. The upgrade went well, apart from the cloudstack-management server
startup issue that I've described in my previous email.
2. There was an issue with the virtual router template upgrade, described
below.

VR template upgrade issue:

After updating the systemvm template, I went to Infrastructure > Virtual
Routers and selected the Update template option for each virtual router. The
virtual routers were updated successfully using the new template. However,
this broke ALL Egress rules on all networks, and none of the guest VMs had
outbound connectivity. Port forwarding / incoming rules were working just
fine. Removing and re-adding Egress rules did not fix the issue. To fix it,
I had to restart each of the networks with the Clean up option ticked.

Cheers

Andrei

----- Original Message -----
From: "Andrei Mikhailovsky"
To: "users"
Sent: Monday, 2 April, 2018 21:44:27
Subject: Re: Upgrade from ACS 4.9.3 to 4.11.0

Hi Rohit,

Following some further investigation, it seems that the installation
packages replaced the following file:

/etc/default/cloudstack-management

with

/etc/default/cloudstack-management.dpkg-dist

Thus the management server couldn't load the env variables and was unable to
start.

I've put the file back and the management server is able to start.

I will let you know if there are any other issues/problems.

Cheers

Andrei
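A sketch of the check-and-restore step for the packaging issue above
(assuming Debian/Ubuntu packaging paths as in this thread, and that the
.dpkg-dist copy is an acceptable starting point; review it for local
customisations before restarting):

    # If the upgrade left only the .dpkg-dist copy behind, put an
    # environment file back in place and restart the service.
    ENVFILE=/etc/default/cloudstack-management
    if [ ! -f "$ENVFILE" ] && [ -f "$ENVFILE.dpkg-dist" ]; then
        cp "$ENVFILE.dpkg-dist" "$ENVFILE"
        systemctl restart cloudstack-management
    fi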
----- Original Message -----
From: "Andrei Mikhailovsky"
To: "users"
Sent: Monday, 2 April, 2018 20:58:59
Subject: Re: Upgrade from ACS 4.9.3 to 4.11.0

Hi Rohit,

I have just upgraded and am having issues starting the service, with the
following error:

    Apr 02 20:56:37 ais-cloudhost13 systemd[1]: cloudstack-management.service: Failed to load environment files: No such file or directory
    Apr 02 20:56:37 ais-cloudhost13 systemd[1]: cloudstack-management.service: Failed to run 'start-pre' task: No such file or directory
    Apr 02 20:56:37 ais-cloudhost13 systemd[1]: Failed to start CloudStack Management Server.
    -- Subject: Unit cloudstack-management.service has failed
    -- Defined-By: systemd

Cheers

Andrei

----- Original Message -----
From: "Rohit Yadav"
To: "users"
Sent: Friday, 30 March, 2018 19:17:48
Subject: Re: Upgrade from ACS 4.9.3 to 4.11.0

Some of the upgrade and minor issues have been fixed and will make their way
into 4.11.1.0. You're welcome to upgrade and share your feedback, but bear
in mind that due to some changes a new/updated systemvmtemplate needs to be
issued for 4.11.1.0 (it will be compatible with both the 4.11.0.0 and
4.11.1.0 releases, but 4.11.0.0 users will have to register that new
template).

- Rohit

________________________________
From: Andrei Mikhailovsky
Sent: Friday, March 30, 2018 11:00:34 PM
To: users
Subject: Upgrade from ACS 4.9.3 to 4.11.0

Hello,

My current infrastructure is ACS 4.9.3 with KVM, based on Ubuntu 16.04
servers for the KVM hosts and the management server.

I am planning to perform an upgrade from ACS 4.9.3 to 4.11.0 and was
wondering if anyone has had any issues during the upgrade? Anything to watch
out for?

I have previously seen issues with upgrading to 4.10, which required some
manual db updates from what I recall. Has this issue been fixed in the 4.11
upgrade process?

thanks

Andrei
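For the "Failed to load environment files" error in the thread above, a
generic troubleshooting sketch using standard systemd tooling (not
CloudStack-specific; the unit and path names are the ones from the log
messages):

    # Show the unit definition, including any EnvironmentFile= lines
    # and the ExecStartPre= task that failed.
    systemctl cat cloudstack-management.service

    # Check that the referenced environment file actually exists.
    ls -l /etc/default/cloudstack-management

    # Full log context for the failed start attempt.
    journalctl -u cloudstack-management.service -b --no-pager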