OK, so from what I could tell, both routers were blocking ICMP on the WAN
port.
Going through the management ports I was able to enable a feature so that both
routers can be pinged on their WAN interface.
I also messed around with the ping command to see if packets of certain sizes
could be sent/received. The results were as follows:
From               To           Packet Size  Result
Internet           Root Router  1473         Reply
Internet           Root Router  1507         Reply
LAN 1              Internet     1473         Reply
LAN 1              Internet     1507         Reply
LAN 2 (XenServer)  Internet     1473         Reply
LAN 2 (XenServer)  Internet     1507         Reply
LAN 2 (SSVM)       Internet     1400         Reply
LAN 2 (SSVM)       Internet     1473         Failure
LAN 2 (SSVM)       Internet     1507         Failure
LAN 1              2nd Router   1473         Reply
LAN 1              2nd Router   1507         Reply
Based on this information I think the problem is at the SSVM / hypervisor
level. I am not sure why the hypervisor hosting the VM is able to receive a
ping of size 1507 but the VM it is hosting cannot.
Why would the MTU be different for the VM?
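One way to read the sizes in the table is sketched below. It assumes the "Packet Size" column is the ICMP payload size given to ping, that the traffic is IPv4 with no IP options, and that the links use the common 1500-byte MTU; none of these is confirmed in this thread. Under those assumptions 1400 fits in a single packet while 1473 and 1507 both need fragmentation, which would be consistent with fragments to or from the SSVM being dropped, although the table alone does not prove it.

```python
# Assumed header sizes: IPv4 without options, ICMP echo request/reply.
IPV4_HEADER = 20
ICMP_HEADER = 8


def ip_packet_size(ping_payload):
    """Total IPv4 packet size for a ping carrying ping_payload data bytes."""
    return ping_payload + ICMP_HEADER + IPV4_HEADER


def needs_fragmentation(ping_payload, mtu=1500):
    """True if the ping cannot cross a link with the given MTU in one piece."""
    return ip_packet_size(ping_payload) > mtu


for payload in (1400, 1473, 1507):
    print(payload, ip_packet_size(payload), needs_fragmentation(payload))
# 1400 1428 False
# 1473 1501 True
# 1507 1535 True
```

With these numbers the largest payload that fits unfragmented is 1472, so a test at 1472 from the SSVM would help separate "fragments are dropped" from "the VM has a smaller MTU".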
My second question is why the PMTUD protocol is not working. If all the points
along the path are answering pings, why can the wget transfer not fall back to
smaller packets?
I am stumped!
Thanks,
Taylor
________________________________
From: Taylor <[email protected]>
Sent: Wednesday, July 19, 2017 12:23 PM
To: [email protected]
Cc: Taylor
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
I reread the wiki article and some other google articles.
I think I understand the ICMP issue now: If the router is blocking all ICMP
then it will not receive the (ICMP) Fragmentation Needed (Type 3, Code 4)
message containing the MTU of the other node on the network with the smaller
MTU.
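For reference, here is a minimal sketch of where the MTU sits inside that message, following the layout in RFC 1191: type, code, checksum, two unused bytes, then a 16-bit next-hop MTU. The function name and the sample bytes are made up for illustration, and the checksum is left at zero rather than computed.

```python
import struct

ICMP_DEST_UNREACHABLE = 3
ICMP_FRAGMENTATION_NEEDED = 4


def next_hop_mtu(icmp_message):
    """Return the next-hop MTU from a Type 3, Code 4 message, else None."""
    if len(icmp_message) < 8:
        return None
    icmp_type, code, _checksum, _unused, mtu = struct.unpack(
        "!BBHHH", icmp_message[:8]
    )
    if icmp_type != ICMP_DEST_UNREACHABLE or code != ICMP_FRAGMENTATION_NEEDED:
        return None
    return mtu


# Hand-built header only; a real message also carries the start of the
# original packet after these 8 bytes.
sample = struct.pack("!BBHHH", 3, 4, 0, 0, 1400)
print(next_hop_mtu(sample))  # 1400
```

Routers that predate RFC 1191 leave the MTU field at zero, so a result of 0 means "smaller, but size unknown" rather than a usable value.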
Going back to my last question: I think the problem is that ICMP is blocked on
the WAN port of the router?? I think that would prevent the NAT traversal of
the ICMP. I am thinking it is enabled on the LAN port because I can ping
google. Is this correct thinking?
I am looking up how to spoof ICMP messages to debug this further.
Thanks,
Taylor
________________________________
From: Taylor <[email protected]>
Sent: Wednesday, July 19, 2017 11:36 AM
To: [email protected]
Cc: Taylor
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
Si,
Thanks for explaining that. Yes, it makes sense.
My two routers are netgear and dd-wrt.
Does the ICMP need to be enabled on both or just the NAT'd router (dd-wrt) ?
I am looking into the router settings / config pages now to get more familiar
with what options are available.
Thanks,
Taylor
________________________________
From: Simon Weller <[email protected]>
Sent: Wednesday, July 19, 2017 11:03 AM
To: [email protected]
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
MTU path discovery uses ICMP type 3 code 4 messages. If your routers are
blocking all ICMP inbound from the internet, they will never receive the
messages and won't know that an upstream router needs the packet to be
re-transmitted with a smaller MTU size.
Does that make sense?
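A toy simulation of that behaviour, as a sketch only: the hop MTUs are hypothetical values and the function is not taken from any real network stack. The sender keeps the don't-fragment flag set and shrinks its packets only when the ICMP message reaches it.

```python
def discover_path_mtu(hop_mtus, start_size, icmp_delivered, max_tries=10):
    """Return the packet size that crosses every hop, or None if it stalls."""
    size = start_size
    for _ in range(max_tries):
        # First hop whose MTU is too small for the current packet size.
        bottleneck = next((mtu for mtu in hop_mtus if mtu < size), None)
        if bottleneck is None:
            return size  # packet fits everywhere
        if not icmp_delivered:
            return None  # dropped silently, sender never learns why
        size = bottleneck  # sender retries at the advertised MTU
    return None


hops = [1500, 1400, 1500]  # hypothetical path
print(discover_path_mtu(hops, 1500, icmp_delivered=True))   # 1400
print(discover_path_mtu(hops, 1500, icmp_delivered=False))  # None
```

The second call is the "black hole" case: small packets still get through, which is why ordinary pings and the start of a connection can work while large transfers stall.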
- Si
________________________________
From: Taylor <[email protected]>
Sent: Wednesday, July 19, 2017 9:37 AM
To: [email protected]
Cc: Taylor
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
I don't think ICMP is being blocked. I can ping google.com from inside the
NAT'd LAN.
Am I misunderstanding what you said?
________________________________
From: Simon Weller <[email protected]>
Sent: Wednesday, July 19, 2017 10:04 AM
To: [email protected]
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
So if your routers are blocking all ICMP, they will break MTU path discovery.
See this: https://en.wikipedia.org/wiki/Path_MTU_Discovery
________________________________
From: Taylor <[email protected]>
Sent: Wednesday, July 19, 2017 8:57 AM
To: [email protected]
Cc: Taylor
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
Hi Simon,
I am not sure about the networking problem you mentioned. I will google that,
and if you have any quick ways to check, let me know.
As for the double NAT, the answer is yes. My network is configured as follows:
Internet -- 1000 Mbps router -- 100 Mbps router -- Hypervisor / Cloudstack / NFS
As far as I am aware the routers are acting as firewalls for incoming traffic
(they are simple home routers) but should not impact outgoing traffic.
Thanks,
Taylor
________________________________
From: Simon Weller <[email protected]>
Sent: Wednesday, July 19, 2017 9:41 AM
To: [email protected]
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
Taylor,
To me this sounds like you might have some sort of networking problem, such as
MTU path discovery being broken and some device in the path setting a do not
fragment flag.
Can you give us a bit more info about how you are connected to the internet as
Dag has suggested? Is there a firewall in front of your switch? Are you double
NATing the traffic?
- Si
________________________________
From: Taylor <[email protected]>
Sent: Wednesday, July 19, 2017 8:09 AM
To: [email protected]
Cc: Taylor
Subject: RE: Cloudstack 4.9 CentOS Template failure - Connection Reset
forgot to cc myself
-------- Original message --------
From: Taylor <[email protected]>
Date: 7/19/17 08:08 (GMT-06:00)
To: [email protected]
Subject: RE: Cloudstack 4.9 CentOS Template failure - Connection Reset
Hey Dag,
I have tried both. The health check is good and the vm behavior does not change
after recreation.
I think the problem is the network latency or the vm's resource allocation.
The download will work, but a connection timeout occurs every 20 MB, so it
needs to be done in pieces. The hypervisor which hosts the VM is able to
download without a problem.
I am on a 100 Mbps switch. In the past I was using a 1000 Mbps one.
Any other thoughts on debug or work around?
I think adding retry logic should be a simple fix?
-------- Original message --------
From: Dag Sonstebo <[email protected]>
Date: 7/19/17 03:11 (GMT-06:00)
To: [email protected]
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
Hi Taylor,
This is most likely an issue with your environment rather than a bug. Take a
look at your public network and how that is connected to the internet. You have
to let CloudStack pull down the template, it’s difficult to manually populate
this.
A couple of other things to try:
- recreate the SSVM – simply delete it and CloudStack will generate a new one.
- from internally in the SSVM you can also run the SSVM check script, which
will do some basic health checks for you:
root@s-2-VM:~# /usr/local/cloud/systemvm/ssvm-check.sh
================================================
First DNS server is 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 48 data bytes
56 bytes from 8.8.8.8: icmp_seq=0 ttl=53 time=24.146 ms
56 bytes from 8.8.8.8: icmp_seq=1 ttl=53 time=22.320 ms
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 22.320/23.233/24.146/0.913 ms
Good: Can ping DNS server
================================================
Good: DNS resolves download.cloud.com
================================================
nfs is currently mounted
Mount point is /mnt/SecStorage/a833f5f1-1c6d-3e54-9a55-1fc9b7875c54
Good: Can write to mount point
================================================
Management server is 10.10.45.2. Checking connectivity.
Good: Can connect to management server port 8250
================================================
Good: Java process is running
================================================
Tests Complete. Look for ERROR or WARNING above.
Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue
On 19/07/2017, 05:29, "Taylor" <[email protected]> wrote:
Hello,
I am experiencing an issue while trying to download the CentOS template.
It seems the connection is timing out and then failing.
To debug I logged into the SSVM and tried running a wget from the nfs
directory mounted on that vm. This also failed due to connection reset.
Wget will eventually succeed if I use retry logic as follows:
wget http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2 --read-timeout=10
Using this setting the download will complete in pieces after multiple
timeouts (logs below).
Is there a work around to add retry logic? Can I manually download and add
the template to the database and restart the service? How can I file a bug
report?
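As an illustration of what such retry logic could look like, here is a minimal sketch that resumes with an HTTP Range request after each timeout, much as the wget log below does with its 206 Partial Content responses. This is not CloudStack's downloader; the function names, chunk size, and retry count are assumptions made for the example.

```python
import os
import urllib.error
import urllib.request


def range_header(saved_bytes):
    """Header asking the server for everything after saved_bytes."""
    return {"Range": f"bytes={saved_bytes}-"} if saved_bytes > 0 else {}


def download_with_resume(url, dest, read_timeout=10, max_tries=20):
    """Fetch url into dest, resuming after timeouts. True when complete."""
    for _ in range(max_tries):
        saved = os.path.getsize(dest) if os.path.exists(dest) else 0
        request = urllib.request.Request(url, headers=range_header(saved))
        try:
            with urllib.request.urlopen(request, timeout=read_timeout) as resp:
                if resp.status != 206:
                    saved = 0  # server sent the whole file, start over
                length = resp.headers.get("Content-Length")
                expected = saved + int(length) if length is not None else None
                with open(dest, "ab" if saved else "wb") as out:
                    while True:
                        chunk = resp.read(64 * 1024)
                        if not chunk:
                            break
                        out.write(chunk)
            if expected is None or os.path.getsize(dest) >= expected:
                return True
        except urllib.error.HTTPError as err:
            if err.code == 416:  # requested range is empty: already complete
                return True
            raise
        except OSError:
            continue  # timeout or reset: retry from the new file size
    return False


print(range_header(74734204))  # {'Range': 'bytes=74734204-'}
```

The offset in the last line is the byte count from the first read error in the log; 374730926 - 74734204 = 299996722, which matches the "remaining" figure wget reports on its second try.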
Thanks,
Taylor
=============================================================================
LOGS:
=============================================================================
root@s-103-VM:/mnt/SecStorage/5d8e791e-01cc-3d7c-84d8-f469944056e0/template/tmpl/1/5# wget http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2 --read-timeout=10
--2017-07-19 03:25:10--  http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Resolving download.cloud.com (download.cloud.com)... 54.231.81.40
Connecting to download.cloud.com (download.cloud.com)|54.231.81.40|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 374730926 (357M) [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
19% [============================> ] 74,734,204 --.-K/s in 97s
2017-07-19 03:26:55 (756 KB/s) - Read error at byte 74734204/374730926 (Connection timed out). Retrying.
--2017-07-19 03:26:56--  (try: 2)  http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|54.231.81.40|:80... failed: Connection timed out.
Resolving download.cloud.com (download.cloud.com)... 52.216.81.16
Connecting to download.cloud.com (download.cloud.com)|52.216.81.16|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 299996722 (286M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
24% [+++++++++++++++++++++++++++++======> ] 93,094,824 --.-K/s in 24s
2017-07-19 03:28:23 (740 KB/s) - Read error at byte 93094824/374730926 (Connection timed out). Retrying.
--2017-07-19 03:28:25--  (try: 3)  http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|52.216.81.16|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 281636102 (269M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
29% [++++++++++++++++++++++++++++++++++++======> ] 112,111,876 --.-K/s in 26s
2017-07-19 03:29:23 (702 KB/s) - Read error at byte 112111876/374730926 (Connection timed out). Retrying.
--2017-07-19 03:29:26--  (try: 4)  http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|52.216.81.16|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 262619050 (250M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
34% [+++++++++++++++++++++++++++++++++++++++++++=======> ] 130,790,615 --.-K/s in 26s
2017-07-19 03:30:23 (712 KB/s) - Read error at byte 130790615/374730926 (Connection timed out). Retrying.
--2017-07-19 03:30:27--  (try: 5)  http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|52.216.81.16|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 243940311 (233M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
49% [+++++++++++++++++++++++++++++++++++++++++++++++++++====================> ] 185,802,362 --.-K/s in 49s
2017-07-19 03:31:23 (1.08 MB/s) - Read error at byte 185802362/374730926 (Connection timed out). Retrying.
--2017-07-19 03:31:28--  (try: 6)  http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|52.216.81.16|:80... failed: Connection timed out.
Resolving download.cloud.com (download.cloud.com)... 52.216.225.48
Connecting to download.cloud.com (download.cloud.com)|52.216.225.48|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 188928564 (180M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
60% [++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++================> ] 226,947,877 --.-K/s in 32s
2017-07-19 03:33:11 (1.21 MB/s) - Read error at byte 226947877/374730926 (Connection timed out). Retrying.
--2017-07-19 03:33:17--  (try: 7)  http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|52.216.225.48|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 147783049 (141M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
65% [+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++======> ] 247,259,880 --.-K/s in 28s
2017-07-19 03:33:46 (697 KB/s) - Read error at byte 247259880/374730926 (Connection timed out). Retrying.
--2017-07-19 03:33:53--  (try: 8)  http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|52.216.225.48|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 127471046 (122M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
100%[++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++==================================================>] 374,730,926 1010K/s in 2m 10s