[ https://issues.apache.org/jira/browse/CLOUDSTACK-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Remi Bergsma updated CLOUDSTACK-9017:
-------------------------------------
    Priority: Major  (was: Minor)

> VPC VR DHCP broken for multihomed guest VMs
> -------------------------------------------
>
> Key: CLOUDSTACK-9017
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9017
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public (Anyone can view this level - this is the default.)
> Components: SystemVM, Virtual Router
> Affects Versions: 4.5.2, 4.6.0, 4.7.0, 4.6.1, 4.6.2
> Environment: CloudStack 4.5.2, XenServer back end.
> Reporter: Dag Sonstebo
> Labels: systemvm, virtualrouter, vpc
>
> Bug: VPC VR DHCP broken for multihomed guest VMs
> Affected version: only CloudStack 4.5.2 tested
> Summary: When a guest VM is attached to more than one VPC tier, DHCP only
> works for the last NIC to be added. According to the end user this is new
> behaviour since the CS 4.5.2 upgrade.
> Workarounds:
> 1) Only use single NICs on VPC connected VMs and configure L3 routing and
> ACLs to handle traffic between tiers.
> 2) Configure additional tier NICs with the static IP addresses reported by
> CloudStack.
>
> ================================================================================================================
> Steps to recreate:
> 1) Create a VPC with two tiers, in this case
> - VPC on 10.3.0.0/16
> - Tier 1 on 10.3.1.0/24
> - Tier 2 on 10.3.2.0/24
> 2) Create a new VM attached to tier 1 only.
> This will cause a new entry to be written to /etc/dhcphosts.txt on the VPC VR:
> root@r-20-VM:~# cat /etc/dhcphosts.txt
> 02:00:21:fd:00:08,set:10_3_1_162,10.3.1.162,BatVM2,infinite
> root@r-20-VM:~#
> When the VM starts up, the following is displayed in /var/log/dnsmasq.log as
> the VM requests its IP address:
> Oct 30 15:50:12 dnsmasq[8246]: read /etc/hosts - 7 addresses
> Oct 30 15:50:12 dnsmasq-dhcp[8246]: read /etc/dhcphosts.txt
> Oct 30 15:50:12 dnsmasq-dhcp[8246]: read /etc/dhcpopts.txt
> Oct 30 15:50:44 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08
> Oct 30 15:50:44 dnsmasq-dhcp[8246]: DHCPOFFER(eth2) 10.3.1.162 02:00:21:fd:00:08
> Oct 30 15:50:44 dnsmasq-dhcp[8246]: DHCPREQUEST(eth2) 10.3.1.162 02:00:21:fd:00:08
> Oct 30 15:50:44 dnsmasq-dhcp[8246]: DHCPACK(eth2) 10.3.1.162 02:00:21:fd:00:08 BatVM2
> The following is displayed in the dnsmasq leases file:
> root@r-20-VM:~# cat /var/lib/misc/dnsmasq.leases
> 0 02:00:21:fd:00:08 10.3.1.162 BatVM2 *
> And the following in the cloud DHCP configuration file:
> root@r-20-VM:~# cat /etc/dnsmasq.d/cloud.conf
> dhcp-hostsfile=/etc/dhcphosts.txt
> dhcp-range=interface:eth3,set:interface-eth3,10.3.2.1,static
> dhcp-option=tag:interface-eth3,15,batvpc.net
> dhcp-range=interface:eth2,set:interface-eth2,10.3.1.1,static
> dhcp-option=tag:interface-eth2,15,batvpc.net
> root@r-20-VM:~#
> 3) Checking the IP configuration locally on the VM will show a DHCP lease in
> place for eth0.
> 4) Add a new NIC to the VM, attached to Tier 2.
> This results in the following entries in the dnsmasq log:
> Oct 30 16:23:02 dnsmasq[8246]: read /etc/hosts - 7 addresses
> Oct 30 16:23:02 dnsmasq-dhcp[8246]: read /etc/dhcphosts.txt
> Oct 30 16:23:02 dnsmasq-dhcp[8246]: read /etc/dhcpopts.txt
> Oct 30 16:23:02 dnsmasq-dhcp[8246]: not giving name BatVM2.batvpc.net to the DHCP lease of 10.3.1.162 because the name exists in /etc/hosts with address 10.3.2.111
> Oct 30 16:23:02 dnsmasq-dhcp[8246]: not giving name BatVM2 to the DHCP lease of 10.3.1.162 because the name exists in /etc/hosts with address 10.3.2.111
> In other words the Tier 2 address has taken precedence over the initial Tier
> 1 address.
> The /etc/dhcphosts.txt file has now lost the Tier 1 entry and now contains:
> root@r-20-VM:~# cat /etc/dhcphosts.txt
> 02:00:26:94:00:06,set:10_3_2_111,10.3.2.111,BatVM2,infinite
> 5) When restarting the VM it will fail to get a DHCP lease on eth0.
> Note: in some cases it will reuse the old lease which is cached in the local
> leases database - note this IP lease does not come from the VPC VR.
> The dnsmasq log will now display the following:
> Oct 30 16:30:36 dnsmasq-dhcp[8246]: DHCPREQUEST(eth2) 10.3.1.162 02:00:21:fd:00:08
> Oct 30 16:30:36 dnsmasq-dhcp[8246]: DHCPNAK(eth2) 10.3.1.162 02:00:21:fd:00:08 address not available
> Oct 30 16:30:44 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:30:58 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:31:13 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:31:22 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:31:32 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:31:37 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth3) 02:00:26:94:00:06
> Oct 30 16:31:37 dnsmasq-dhcp[8246]: DHCPOFFER(eth3) 10.3.2.111 02:00:26:94:00:06
> Oct 30 16:31:37 dnsmasq-dhcp[8246]: DHCPREQUEST(eth3) 10.3.2.111 02:00:26:94:00:06
> Oct 30 16:31:37 dnsmasq-dhcp[8246]: DHCPACK(eth3) 10.3.2.111 02:00:26:94:00:06 BatVM2
> I.e. the VM is not receiving a DHCP offer on eth0 as there are no addresses
> configured, however eth1 successfully handshakes.
> 6) Note - restart of the VPC VR / restart of network with cleanup does not
> seem to fix the issue.
> 7) Just removing the last added NIC does not fix the issue:
> The DHCP host file still contains the following, i.e.
> the host entry from the last added NIC:
> root@r-20-VM:~# cat /etc/dhcphosts.txt
> 02:00:26:94:00:06,set:10_3_2_111,10.3.2.111,BatVM2,infinite
> root@r-20-VM:~#
> Restarting the VM after removal will show:
> Oct 30 16:42:00 dnsmasq-dhcp[8246]: DHCPREQUEST(eth2) 10.3.1.162 02:00:21:fd:00:08
> Oct 30 16:42:00 dnsmasq-dhcp[8246]: DHCPNAK(eth2) 10.3.1.162 02:00:21:fd:00:08 address not available
> Oct 30 16:42:08 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:42:19 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:42:30 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:42:50 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> I.e. still no DHCP lease on Tier 1.
> 8) Getting DHCP to work again on the guest VM eth0 involves juggling NICs:
> - Making the last added NIC (eth1) primary.
> - Removing the first NIC (eth0) as discussed in step 7 above.
> - Re-adding a new NIC on Tier 1.
> - At this point DHCP will work on the Tier 1 NIC, but will be broken on
> the Tier 2 NIC.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
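
The symptom in steps 4 to 7 (one NIC's entry disappearing from /etc/dhcphosts.txt) can be checked mechanically. Below is a minimal sketch in Python, not CloudStack code: the function names `parse_dhcphosts` and `missing_nics` are hypothetical, and the field layout (MAC, tag, IP, hostname, lease time) is assumed from the two sample lines quoted in the report. Given the file contents and the (MAC, IP) pairs that CloudStack reports for a VM, it lists the NICs that have no matching static host entry.

```python
from typing import Dict, List, Tuple

# File contents after the Tier 2 NIC was added (copied from step 4 of the report).
AFTER_SECOND_NIC = "02:00:26:94:00:06,set:10_3_2_111,10.3.2.111,BatVM2,infinite\n"

# The (MAC, IP) pairs the VM is expected to have, taken from the report.
EXPECTED_NICS = [
    ("02:00:21:fd:00:08", "10.3.1.162"),  # Tier 1 NIC
    ("02:00:26:94:00:06", "10.3.2.111"),  # Tier 2 NIC
]


def parse_dhcphosts(text: str) -> Dict[str, Tuple[str, str]]:
    """Map MAC -> (IP, hostname), assuming 'mac,set:tag,ip,hostname,lease' lines."""
    entries: Dict[str, Tuple[str, str]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(",")
        if len(fields) < 4:
            continue  # not in the assumed layout; ignore rather than guess
        mac, _tag, ip, hostname = fields[:4]
        entries[mac.lower()] = (ip, hostname)
    return entries


def missing_nics(dhcphosts_text: str,
                 expected: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the expected (MAC, IP) pairs that have no matching host entry."""
    entries = parse_dhcphosts(dhcphosts_text)
    missing = []
    for mac, ip in expected:
        entry = entries.get(mac.lower())
        if entry is None or entry[0] != ip:
            missing.append((mac, ip))
    return missing
```

With the sample data above, `missing_nics(AFTER_SECOND_NIC, EXPECTED_NICS)` returns only the Tier 1 pair, which matches the DHCPNAK and "no address available" messages logged for 02:00:21:fd:00:08.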