We thought about this possibility and checked every MAC and every IP address
on those 3 switches, and everything is clean. We even set up a syslog server
and started logging everything from those 3 switches. That's why we configured
a physical box with the same IP address (after shutting down the VR) and
checked the MAC address tables on those switches. We also had STP set up for
redundancy; we considered that as a possible cause too, so we removed the
redundant cabling and stopped STP to see if it would help. Nothing so far
points to the network setup, the switch ports, or the IP address. We even
placed the XenServer host on the very switch that the internet gateway is
connected to, and we still have the same problem of waiting 1-3 hours before
it starts pinging...

One thing we realized during those checks concerned one of our management
routers, which we were using with DHCP (same management network, but the IP
ranges for CS and for our router were different). We realized this was a big
mistake, since CloudStack uses its own DHCP on the management network, so we
disabled DHCP on that router. It is not really related to our public network;
it was being used to issue DHCP addresses to wireless clients (our iPads,
laptops, etc.). We thought different IP ranges would be OK, but they weren't.
Just a side note.

I would also like to mention one other thing. When we first set this up (this
is our second or third setup due to this public IP problem), we immediately
checked the other system VMs. The SSVM came up and started pinging right away.
The console proxy hesitated for almost 10 minutes before it started pinging,
which was still quick. We see the 1-3 hour delay only with routers. In
previous setups we have also seen the SSVM and console proxy hesitate for some
time, but they came up quicker than the routers.

iptables -L being very slow to respond while the VR is not working is the most
significant symptom we have found so far. We just wait for a few hours and
issue iptables -L; if it responds very fast, then we know we can ping the
gateway... and sure enough we can.
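
To put numbers on this symptom instead of judging it by feel, the listing can
be timed. The sketch below is only an illustration: the helper names
time_command and compare_listings are made up for this example. It compares
the plain listing with iptables -L -n; the -n flag prints addresses
numerically and so skips reverse DNS lookups. Whether name resolution is
actually what slows the listing on this VR is an assumption to be tested, not
something established in this thread.

```python
import subprocess
import time


def time_command(argv, timeout=120.0):
    """Run argv and return (elapsed_seconds, return_code).

    return_code is None if the command did not finish within `timeout`.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        code = proc.returncode
    except subprocess.TimeoutExpired:
        code = None
    return time.monotonic() - start, code


def compare_listings():
    """Hypothetical helper: time the default listing against the numeric one.

    Intended to be run as root on the VR; not called automatically here.
    """
    for argv in (["iptables", "-L"], ["iptables", "-L", "-n"]):
        elapsed, code = time_command(argv)
        print(f"{' '.join(argv)}: {elapsed:.2f}s (exit code {code})")
```

Calling compare_listings() once while the gateway is unreachable and once
after pinging starts would show whether only the non-numeric listing is slow.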

Thanks,

Sam Ceylani, MBA
Computer Engineer
MisterCertified Inc.

301 W. Platt St. Suite 447, Tampa, FL 33606
P 813.264.6460 M 813.416.7867
F 800.553.9520 E [email protected]

On Oct 28, 2014, at 11:29 PM, "Philippe Bechamp" <[email protected]> wrote:

Hi,

Have you considered an IP address conflict ?

arping could help you track this down if that could be the case.
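
A minimal sketch of how the arping check could be automated, assuming reply
lines in the style printed by the iputils version of arping (for example from
arping -I <interface> <ip>). The function names and the sample output are made
up for illustration; the addresses are documentation placeholders, not values
from this setup. If more than one MAC address answers for the same IP, that
would point toward a conflict.

```python
import re
from collections import defaultdict

# Matches reply lines in the style printed by iputils arping, e.g.
#   Unicast reply from 192.0.2.10 [06:AA:BB:00:00:01]  0.812ms
REPLY_RE = re.compile(
    r"reply from (?P<ip>\d{1,3}(?:\.\d{1,3}){3}) \[(?P<mac>[0-9A-Fa-f:]{17})\]"
)

# Made-up sample output (assumption: two different hosts answer for one IP).
SAMPLE = """\
ARPING 192.0.2.10 from 192.0.2.1 eth0
Unicast reply from 192.0.2.10 [06:AA:BB:00:00:01]  0.812ms
Unicast reply from 192.0.2.10 [00:1B:21:00:00:02]  0.934ms
"""


def macs_by_ip(arping_output):
    """Map each replying IP to the set of MAC addresses that answered for it."""
    seen = defaultdict(set)
    for line in arping_output.splitlines():
        match = REPLY_RE.search(line)
        if match:
            seen[match.group("ip")].add(match.group("mac").lower())
    return dict(seen)


def conflicts(arping_output):
    """Return only the IPs that were answered by more than one MAC."""
    return {
        ip: macs
        for ip, macs in macs_by_ip(arping_output).items()
        if len(macs) > 1
    }
```

With the sample above, conflicts(SAMPLE) reports 192.0.2.10 together with both
MAC addresses; an empty result means every IP was answered by a single MAC.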

No hard data but instinct and internettance scream IP conflict in my brain !

Good luck !

Phil.
--
Phil Bechamp | Director of Online Operations
+1.514.812.9609 ext. 222

________________________________________
From: Sam Ceylani [[email protected]]
Sent: Tuesday, October 28, 2014 11:19 PM
To: [email protected]
Subject: VR not able to ping public gateway for almost 3 hours then it WORKS...

VR not able to ping public gateway for almost 3 hours then it works.


CloudStack 4.4.1 (new install) and XenServer 6.2; the public and management
networks are untagged and use VLAN 1. For some reason, when a VR is created it
is not able to communicate with its public gateway for almost 2-3 hours, and
then all of a sudden it starts pinging. Once it starts pinging, restarting the
VR etc. is not a problem; it works as soon as it comes up. But the problem
happens again when the router is destroyed and created again: it cannot ping
the gateway for some time, and it takes 30-45 minutes, sometimes 2-3 hours, to
start working again.

We have 3 HP switches. The first one is connected to the internet gateway, and
the XenServer hosts are connected via (active-passive) bonds to untagged ports
on the 2 other switches (through a trunk port). The networks are: iSCSI
(primary storage, VLANs 30, 31, 32, 33), NFS (secondary storage, VLAN 34),
guest (VLANs 500-550), public (VLAN 1), management (VLAN 1).

We log on to the virtual router and issue iptables -L, and the response is
very slow (once it starts working, the response is very fast). We tried to
traceroute the gateway IP; the response is very fast, but a blank * * * is
displayed for all 30 hops. ifconfig -a displays all the right information for
the network interfaces. We tried removing and reinserting the egress rule
(ALL), but that didn't help; we still had to wait a few hours for the router
to start pinging again.

We tried using this same IP on a physical machine connected to this same
switch on an untagged port, and it works as soon as we configure the IP. We
can ping this VR from outside and it responds OK, so we know that the network
configuration is OK. We suspect that the firewall rules are not being
downloaded in a timely manner, but we checked the /var/log/cloud.log file on
the router and there is really no change before and after it starts pinging,
so we really don't know how to troubleshoot this problem any further...

If requested, I can upload the cloud.log file from the VR. We compared this
log file with one from a working VR and found no difference between them.

The template file and CS 4.4.1 were downloaded around Oct 6.

I know it is hard to troubleshoot this kind of issue, but if you can point me
to possible causes, that would be perfect, so we have somewhere to start
troubleshooting this problem.

When we used tcpdump on the router, we noticed that before it starts working
we see more traffic displayed (conversations about almost every network
activity on the switch), and once it starts working there is almost a 60%
reduction in TCP conversations across all interfaces on the router...
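
One way to turn the tcpdump observation above into numbers is to capture with
link-level headers (tcpdump -e -n) before and after the router starts working,
then count how many frames are addressed to the router itself versus to other
hosts. The sketch below assumes the usual "src > dst, ethertype ..." header
layout that -e prints; classify_frames and its category names are made up for
this illustration. What a high share of frames for other hosts would mean for
the root cause is not something this thread establishes.

```python
import re
from collections import Counter

# Link-level header as printed by `tcpdump -e -n`, e.g.
#   12:00:01.000001 aa:bb:cc:00:00:01 > aa:bb:cc:00:00:02, ethertype IPv4 ...
MAC = r"[0-9a-f]{2}(?::[0-9a-f]{2}){5}"
HEADER_RE = re.compile(rf"(?P<src>{MAC}) > (?P<dst>{MAC}),", re.IGNORECASE)


def classify_frames(lines, own_mac):
    """Count frames per category: 'group', 'to_us', 'from_us', 'foreign'.

    'group' covers broadcast and multicast destinations; 'foreign' means
    unicast frames that are neither sent by nor addressed to own_mac.
    Lines without a recognizable link-level header are skipped.
    """
    own_mac = own_mac.lower()
    counts = Counter()
    for line in lines:
        match = HEADER_RE.search(line)
        if not match:
            continue
        src = match.group("src").lower()
        dst = match.group("dst").lower()
        if int(dst[:2], 16) & 1:  # group bit set: broadcast or multicast
            counts["group"] += 1
        elif dst == own_mac:
            counts["to_us"] += 1
        elif src == own_mac:
            counts["from_us"] += 1
        else:
            counts["foreign"] += 1
    return counts
```

Running this over two saved captures, one from the non-working period and one
from the working period, would show which category accounts for the drop.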

Thanks,

Sam
