Yes, by default the ARP timers in their gateways are set very long. If a VM is destroyed and a new VM picks up the old IP (but with a new MAC), the router continues to send traffic to the old MAC until the cached entry expires. The fix is to set the ARP timeout very low.
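
To confirm that a stale entry is what you are hitting, compare the MAC a Linux host on the same segment has cached for the VM's IP with the MAC the new VM actually has. This is only a rough sketch, not something from the CloudStack tree: the function names are made up, and it assumes the neighbour table is in the format printed by `ip neigh show`. It checks a Linux host's cache, not the switch's own ARP table.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CloudStack).

# Print the MAC that a neighbour table holds for IP $1.
# The table is read from stdin in `ip neigh show` format, e.g.
#   10.1.1.50 dev eth0 lladdr 02:00:aa:bb:cc:01 STALE
mac_for_ip() {
    awk -v ip="$1" '
        $1 == ip {
            for (i = 2; i < NF; i++)
                if ($i == "lladdr") { print tolower($(i + 1)); exit }
        }'
}

# arp_entry_status IP EXPECTED_MAC   (table on stdin)
# Prints "ok" if the cached MAC matches, "stale" if it differs,
# and "missing" if there is no resolved entry for the IP.
arp_entry_status() {
    cached=$(mac_for_ip "$1")
    expected=$(printf '%s' "$2" | tr 'A-Z' 'a-z')
    if [ -z "$cached" ]; then
        echo "missing"
    elif [ "$cached" = "$expected" ]; then
        echo "ok"
    else
        echo "stale"
    fi
}
```

Usage would be something like `ip neigh show | arp_entry_status 10.1.1.50 02:00:aa:bb:cc:02`, where the IP and MAC are placeholders for the new VM's values. A result of "stale" on a host that should have seen the new VM points at the ARP cache rather than at routing.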

On 12/10/12 10:46 AM, "Chip Childers" <chip.child...@sungard.com> wrote:

>On Mon, Dec 3, 2012 at 9:29 PM, Chiradeep Vittal
><chiradeep.vit...@citrix.com> wrote:
>> Is there an Arista switch in the path by any chance?
>
>Out of curiosity, why do you ask Chiradeep? Have you run into
>oddities with them?
>
>> On 12/3/12 2:44 PM, "Anthony Xu" <xuefei...@citrix.com> wrote:
>>
>>>> What is the point of adding this extra route - if external routing
>>>> handles that by default?
>>>The extra route is added to make sure management server can talk to
>>>route VM.
>>>
>>>Can you share your setup info?
>>>Zone setup, network type, Private IP range, public IP range, VLAN info.
>>>
>>>This issue might happen when private ip CIDR overlaps with public ip
>>>CIDR, CS might not check this case.
>>>
>>>Anthony
>>>
>>>> -----Original Message-----
>>>> From: Musayev, Ilya [mailto:imusa...@webmd.net]
>>>> Sent: Monday, December 03, 2012 2:15 PM
>>>> To: cloudstack-dev@incubator.apache.org
>>>> Subject: RE: Router VM and Network Issue
>>>>
>>>> Let me retract this comment for now and do more thorough testing.. It
>>>> appears it fixes the issue on 1 type of network and breaks on
>>>> another...
>>>>
>>>> -----Original Message-----
>>>> From: Musayev, Ilya [mailto:imusa...@webmd.net]
>>>> Sent: Monday, December 03, 2012 4:43 PM
>>>> To: cloudstack-dev@incubator.apache.org
>>>> Subject: RE: Router VM and Network Issue
>>>>
>>>> Anthony,
>>>>
>>>> I do have the code below, but my fix was to remove the extra route
>>>> that is added by command
>>>>
>>>> "ip route add $MGMTNET via $LOCAL_GW dev eth1" from cloud-early-config
>>>>
>>>> Once I commented that part out - everything is working fine..
>>>>
>>>> What is the point of adding this extra route - if external routing
>>>> handles that by default?
>>>>
>>>> In my case - the "route add" created ARP issues.
>>>>
>>>> Thank you for very helpful feedback
>>>> -ilya
>>>>
>>>> -----Original Message-----
>>>> From: Anthony Xu [mailto:xuefei...@citrix.com]
>>>> Sent: Monday, December 03, 2012 4:08 PM
>>>> To: cloudstack-dev@incubator.apache.org
>>>> Subject: RE: Router VM and Network Issue
>>>>
>>>> I checked the code
>>>> in ./patches/systemvm/debian/config/etc/init.d/cloud-early-config
>>>>
>>>> # a hacking way to activate vSwitch under VMware
>>>> ping -n -c 3 $GW &
>>>> sleep 3
>>>> pkill ping
>>>> if [ -n "$MGMTNET" -a -n "$LOCAL_GW" ]
>>>> then
>>>>     ping -n -c 3 $LOCAL_GW &
>>>>     sleep 3
>>>>     pkill ping
>>>> fi
>>>>
>>>> It pings both local and public gateway,
>>>>
>>>> Could you check the file in your setup to see if the fix is in?
>>>>
>>>> Below is the procedure to fix the issue in your setup,
>>>>
>>>> 1. Check in the fix
>>>> 2. get the latest build
>>>> 3. upgrade your current setup
>>>> 4. stop/start all system VM (SSVM , CPVM, router VM)
>>>>
>>>> Anthony
>>>>
>>>> > -----Original Message-----
>>>> > From: Musayev, Ilya [mailto:imusa...@webmd.net]
>>>> > Sent: Monday, December 03, 2012 12:37 PM
>>>> > To: cloudstack-dev@incubator.apache.org
>>>> > Subject: RE: Router VM and Network Issue
>>>> >
>>>> > Anthony,
>>>> >
>>>> > It does ping the local gateways and I can see that happening when
>>>> > router VM boots up.
>>>> >
>>>> > But the fix is to ping either CS Core or CS gateway - that truly
>>>> > addresses the issue.
>>>> >
>>>> > Any thoughts of how I can create this behavior in reproducible
>>>> > fashion for all new routers?
>>>> >
>>>> > Thanks
>>>> > ilya
>>>> >
>>>> > -----Original Message-----
>>>> > From: Anthony Xu [mailto:xuefei...@citrix.com]
>>>> > Sent: Monday, December 03, 2012 3:12 PM
>>>> > To: cloudstack-dev@incubator.apache.org
>>>> > Subject: RE: Router VM and Network Issue
>>>> >
>>>> > I remember we have a fix for this, when route VM boots up, it tries
>>>> > to ping default gateway to propagate its MAC to switch.
>>>> >
>>>> > Might be this fix is not checked into CS 4.0
>>>> >
>>>> > Anthony
>>>> >
>>>> > > -----Original Message-----
>>>> > > From: Musayev, Ilya [mailto:imusa...@webmd.net]
>>>> > > Sent: Monday, December 03, 2012 10:59 AM
>>>> > > To: cloudstack-dev@incubator.apache.org
>>>> > > Subject: Router VM and Network Issue
>>>> > >
>>>> > > So I hit a glitch where a router VM boots up but does not really
>>>> > > pass any traffic unless I ping the gateway of the CS host from
>>>> > > within the router VM.
>>>> > >
>>>> > > Once the gateway ping goes through, CS is able to SSH into a
>>>> > > router VM and everything is fine and dandy..
>>>> > >
>>>> > > But this behavior really puzzles me. Linux network stack is not
>>>> > > fully activated or routing is not fully functional until the
>>>> > > initial CS GW ping.
>>>> > >
>>>> > > Also I cant ping/ssh the router VM from CS unless a initiate a
>>>> > > ping from within the router VM.
>>>> > >
>>>> > > I'm on CS 4.0 and vSphere5. This seem to affect the Advanced
>>>> > > Network setup more than Basic because of routing complexity - as
>>>> > > you add some routes into linux routing table.
>>>> > >
>>>> > > Has anyone seen this before?