Hi, 

I've created an entry in Jira for this issue: 
http://bugs.cloudstack.org/browse/CS-15308
Dumps for all the VIFs of the SSVM are attached to the Jira entry itself.

An extract from the traces, captured while trying to ping the address of the
secondary storage NFS share, is reported below.
Please note that the secondary storage NFS share is on a distinct network, and
the gateway configured for the pod (192.168.0.1) NATs to this network.

eth1

11:49:29.082189 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.3.103 tell 192.168.0.1, length 46
11:49:29.082202 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.3.103 is-at 06:c1:7c:00:00:04, length 28
11:49:29.226796 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.3.110 tell 192.168.0.1, length 46
11:49:30.227001 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.3.110 tell 192.168.0.1, length 46

eth2

11:49:25.218750 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.3.110 > 10.70.177.15: ICMP echo request, id 33826, seq 1280, length 64
11:49:25.220138 IP (tos 0x0, ttl 63, id 3370, offset 0, flags [none], proto ICMP (1), length 84)
    10.70.177.15 > 192.168.3.110: ICMP echo reply, id 33826, seq 1280, length 64
11:49:25.225618 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.3.110 tell 192.168.0.1, length 46
11:49:26.220070 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.3.110 > 10.70.177.15: ICMP echo request, id 33826, seq 1536, length 64
11:49:26.221735 IP (tos 0x0, ttl 63, id 3371, offset 0, flags [none], proto ICMP (1), length 84)
    10.70.177.15 > 192.168.3.110: ICMP echo reply, id 33826, seq 1536, length 64
11:49:26.225690 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.3.110 tell 192.168.0.1, length 46
11:49:27.221391 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.3.110 > 10.70.177.15: ICMP echo request, id 33826, seq 1792, length 64
11:49:27.222894 IP (tos 0x0, ttl 63, id 3372, offset 0, flags [none], proto ICMP (1), length 84)
    10.70.177.15 > 192.168.3.110: ICMP echo reply, id 33826, seq 1792, length 64
11:49:27.225831 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.3.110 tell 192.168.0.1, length 46
11:49:28.222711 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.3.110 > 10.70.177.15: ICMP echo request, id 33826, seq 2048, length 64
11:49:28.225105 IP (tos 0x0, ttl 63, id 3373, offset 0, flags [none], proto ICMP (1), length 84)
    10.70.177.15 > 192.168.3.110: ICMP echo reply, id 33826, seq 2048, length 64
11:49:29.224031 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.3.110 > 10.70.177.15: ICMP echo request, id 33826, seq 2304, length 64
11:49:29.226845 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.3.110 tell 192.168.0.1, length 46
11:49:30.225353 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.3.110 > 10.70.177.15: ICMP echo request, id 33826, seq 2560, length 64
11:49:30.227048 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.3.110 tell 192.168.0.1, length 46

eth3

11:49:29.226817 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.3.110 tell 192.168.0.1, length 46
11:49:29.687318 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.9.10 tell 192.168.9.12, length 46
11:49:30.227018 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.3.110 tell 192.168.0.1, length 46
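
For anyone who wants to repeat this check on the full dumps attached to the
Jira entry, here is a minimal Python sketch that lists ARP targets which were
asked for but never answered in a trace. It assumes plain tcpdump text output
like the extract above; the function name unanswered_arp and the sample
string are illustrative only and are not part of any CloudStack tooling.

```python
import re
from collections import Counter

# \s+ between tokens tolerates lines that were wrapped by a mail client.
REQUEST = re.compile(r"Request who-has\s+(\S+)\s+tell\s+(\S+),")
REPLY = re.compile(r"Reply\s+(\S+)\s+is-at\s+([0-9a-f:]+),")


def unanswered_arp(trace):
    """Return {target_ip: request_count} for targets with no ARP reply."""
    requests = Counter(target for target, _sender in REQUEST.findall(trace))
    answered = {ip for ip, _mac in REPLY.findall(trace)}
    return {ip: n for ip, n in requests.items() if ip not in answered}


# Sample copied from the eth1 extract in this message.
ETH1_TRACE = """
11:49:29.082189 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.3.103 tell 192.168.0.1, length 46
11:49:29.082202 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.3.103 is-at 06:c1:7c:00:00:04, length 28
11:49:29.226796 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.3.110 tell 192.168.0.1, length 46
11:49:30.227001 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.3.110 tell 192.168.0.1, length 46
"""

if __name__ == "__main__":
    print(unanswered_arp(ETH1_TRACE))
```

On the eth1 sample this reports two unanswered requests for 192.168.3.110,
which matches what can be read directly from the extract.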

> -----Original Message-----
> From: Anthony Xu [mailto:xuefei...@citrix.com]
> Sent: 15 June 2012 20:58
> To: cloudstack-dev@incubator.apache.org
> Subject: RE: Advice on SSVM network interfaces
> 
> Salvatore,
> 
> I guess the attachment was discarded by the mail server. Could you please
> post the content in the mail?
> 
> Please file a bug for this.
> 
> 
> Regards,
> Anthony
> 
> > -----Original Message-----
> > From: Salvatore Orlando [mailto:salvatore.orla...@eu.citrix.com]
> > Sent: Friday, June 15, 2012 12:49 PM
> > To: cloudstack-dev@incubator.apache.org
> > Subject: RE: Advice on SSVM network interfaces
> >
> > Anthony,
> >
> > At first I too was suspecting the ARP reply was sent on eth1 and then
> > discarded by the gateway because the SRC MAC address of the frame did
> > not match the MAC in the ARP payload.
> > However, I've not seen any ARP reply on eth1. I'm attaching the dumps
> > so you can have a look at them (I don't know whether they'll be
> > mangled or not, though).
> >
> > I agree this might be a bug. My perspective is that there's no reason
> > for having several network interfaces unless distinct labels are being
> > used for management/public/storage network.
> > Is it ok for you if I report an issue on bugs.cloudstack.org and
> > assign it to you?
> >
> > Regards,
> > Salvatore
> >
> > > -----Original Message-----
> > > From: Anthony Xu [mailto:xuefei...@citrix.com]
> > > Sent: 15 June 2012 19:59
> > > To: cloudstack-dev@incubator.apache.org
> > > Subject: RE: Advice on SSVM network interfaces
> > >
> > > Hi Salvatore,
> > >
> > > From your description, the ARP response is sent out through eth1.
> > > Some switches may drop this kind of packet, since they expect to
> > > receive the ARP response on the same port the ARP request was sent
> > > out from.
> > >
> > > I think it is a bug. In this case, CloudStack should not configure
> > > eth2 and eth3 for the SSVM; if the SSVM does need several IPs, they
> > > should all be configured on eth1 if they are in the same subnet.
> > >
> > >
> > >
> > > Regards,
> > > Anthony
> > >
> > >
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Salvatore Orlando [mailto:salvatore.orla...@eu.citrix.com]
> > > > Sent: Friday, June 15, 2012 11:00 AM
> > > > To: cloudstack-dev@incubator.apache.org
> > > > Subject: Advice on SSVM network interfaces
> > > >
> > > > Hi,
> > > >
> > > > We have a test environment where a basic zone is deployed.
> > > > Both system VMs and guest addresses are in the 192.168.0.0/16 subnet,
> > > > albeit with distinct IP ranges.
> > > >
> > > > We noticed that the SSVM is unable to download templates, as the
> > > > connection over the public interface (eth2) is suddenly dropped (see
> > > > attached dump).
> > > > As can be seen from the dump, the connection drops because the SSVM
> > > > fails to answer ARP requests from the gateway on eth2.
> > > > ARP requests sent to eth2's address from other machines in the same
> > > > network also go unanswered.
> > > >
> > > > Here is the relevant configuration info from the SSVM:
> > > >
> > > > root@s-3-VM:~# ip addr show
> > > > 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
> > > >     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> > > >     inet 127.0.0.1/8 scope host lo
> > > >     inet6 ::1/128 scope host
> > > >        valid_lft forever preferred_lft forever
> > > > 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
> > > >     link/ether 0e:00:a9:fe:02:5c brd ff:ff:ff:ff:ff:ff
> > > >     inet 169.254.2.92/16 brd 169.254.255.255 scope global eth0
> > > >     inet6 fe80::c00:a9ff:fefe:25c/64 scope link
> > > >        valid_lft forever preferred_lft forever
> > > > 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
> > > >     link/ether 06:de:1e:00:00:03 brd ff:ff:ff:ff:ff:ff
> > > >     inet 192.168.3.102/16 brd 192.168.255.255 scope global eth1
> > > >     inet6 fe80::4de:1eff:fe00:3/64 scope link
> > > >        valid_lft forever preferred_lft forever
> > > > 4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
> > > >     link/ether 06:5d:f6:00:00:0b brd ff:ff:ff:ff:ff:ff
> > > >     inet 192.168.3.110/16 brd 192.168.255.255 scope global eth2
> > > >     inet6 fe80::45d:f6ff:fe00:b/64 scope link
> > > >        valid_lft forever preferred_lft forever
> > > > 5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
> > > >     link/ether 06:93:c6:00:00:04 brd ff:ff:ff:ff:ff:ff
> > > >     inet 192.168.3.103/16 brd 192.168.255.255 scope global eth3
> > > >     inet6 fe80::493:c6ff:fe00:4/64 scope link
> > > >        valid_lft forever preferred_lft forever
> > > >
> > > >
> > > > root@s-3-VM:~# ip route
> > > > 169.254.0.0/16 dev eth0  proto kernel  scope link  src 169.254.2.92
> > > > 192.168.0.0/16 dev eth1  proto kernel  scope link  src 192.168.3.102
> > > > 192.168.0.0/16 dev eth2  proto kernel  scope link  src 192.168.3.110
> > > > 192.168.0.0/16 dev eth3  proto kernel  scope link  src 192.168.3.103
> > > > default via 192.168.0.1 dev eth2
> > > >
> > > >
> > > > root@s-3-VM:~# sysctl -a | grep ipv4.conf.*.arp
> > > > error: permission denied on key 'net.ipv4.route.flush'
> > > > net.ipv4.conf.all.proxy_arp = 0
> > > > net.ipv4.conf.all.arp_filter = 0
> > > > net.ipv4.conf.all.arp_announce = 2
> > > > net.ipv4.conf.all.arp_ignore = 2
> > > > net.ipv4.conf.all.arp_accept = 0
> > > > net.ipv4.conf.all.arp_notify = 0
> > > > net.ipv4.conf.default.proxy_arp = 0
> > > > net.ipv4.conf.default.arp_filter = 0
> > > > net.ipv4.conf.default.arp_announce = 2
> > > > net.ipv4.conf.default.arp_ignore = 2
> > > > net.ipv4.conf.default.arp_accept = 0
> > > > net.ipv4.conf.default.arp_notify = 0
> > > > net.ipv4.conf.lo.proxy_arp = 0
> > > > net.ipv4.conf.lo.arp_filter = 0
> > > > net.ipv4.conf.lo.arp_announce = 2
> > > > net.ipv4.conf.lo.arp_ignore = 2
> > > > net.ipv4.conf.lo.arp_accept = 0
> > > > net.ipv4.conf.lo.arp_notify = 0
> > > > net.ipv4.conf.eth0.proxy_arp = 0
> > > > net.ipv4.conf.eth0.arp_filter = 0
> > > > net.ipv4.conf.eth0.arp_announce = 2
> > > > net.ipv4.conf.eth0.arp_ignore = 2
> > > > net.ipv4.conf.eth0.arp_accept = 0
> > > > net.ipv4.conf.eth0.arp_notify = 0
> > > > net.ipv4.conf.eth1.proxy_arp = 0
> > > > net.ipv4.conf.eth1.arp_filter = 0
> > > > net.ipv4.conf.eth1.arp_announce = 2
> > > > net.ipv4.conf.eth1.arp_ignore = 2
> > > > net.ipv4.conf.eth1.arp_accept = 0
> > > > net.ipv4.conf.eth1.arp_notify = 0
> > > > net.ipv4.conf.eth2.proxy_arp = 0
> > > > net.ipv4.conf.eth2.arp_filter = 0
> > > > net.ipv4.conf.eth2.arp_announce = 2
> > > > net.ipv4.conf.eth2.arp_ignore = 2
> > > > net.ipv4.conf.eth2.arp_accept = 0
> > > > net.ipv4.conf.eth2.arp_notify = 0
> > > > net.ipv4.conf.eth3.proxy_arp = 0
> > > > net.ipv4.conf.eth3.arp_filter = 0
> > > > net.ipv4.conf.eth3.arp_announce = 2
> > > > net.ipv4.conf.eth3.arp_ignore = 2
> > > > net.ipv4.conf.eth3.arp_accept = 0
> > > > net.ipv4.conf.eth3.arp_notify = 0
> > > >
> > > > The behaviour is exactly what one would expect if arp_filter were
> > > > enabled on the interfaces, but the flag is clearly set to 0. Also,
> > > > setting arp_ignore to 0 does not cause the expected ARP flux
> > > > problem, as replies are sent only from the first virtual interface
> > > > (eth1). In a way, it looks as if policies were being enforced
> > > > through arptables, but the module does not seem to be loaded, nor
> > > > is the userspace utility available on the SSVM.
> > > >
> > > > Of course, changing the order in the route table as follows, i.e.
> > > > putting eth2 before eth1 for 192.168.0.0/16, solves the issue.
> > > >
> > > > 169.254.0.0/16 dev eth0  proto kernel  scope link  src 169.254.2.92
> > > > 192.168.0.0/16 dev eth2  proto kernel  scope link  src 192.168.3.110
> > > > 192.168.0.0/16 dev eth1  proto kernel  scope link  src 192.168.3.102
> > > > 192.168.0.0/16 dev eth3  proto kernel  scope link  src 192.168.3.103
> > > > default via 192.168.0.1 dev eth2
> > > >
> > > > Quite interestingly, after this change ARP requests to eth2 are
> > > > honoured by the SSVM even after it is rebooted, and even if the
> > > > relevant ARP cache entry in the gateway is removed. Of course, this
> > > > is not the case when the SSVM is destroyed, as the new SSVM will
> > > > have a different MAC address for every interface.
> > > >
> > > > It is also interesting to note that in another setup, where we
> > > > configured an advanced zone, this problem does not occur. Even
> > > > though that SSVM is deployed in an advanced zone, its network
> > > > configuration is very similar.
> > > >
> > > > It would be great if you can provide some advice for debugging
> > > > this issue, or share similar experiences.
> > > >
> > > > Regards,
> > > > Salvatore
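
As a rough illustration of why the order of the three 192.168.0.0/16 entries
in the quoted routing table matters, the Python sketch below models route
selection for the gateway address 192.168.0.1. It is a simplified model and
not the kernel's actual lookup code: the rule that the earliest entry wins
among routes with the same prefix length is an assumption, chosen because it
is consistent with the reordering fix described in the quoted message. The
names ROUTES, REORDERED and pick_device are hypothetical.

```python
import ipaddress

# (prefix, device) pairs in the order printed by `ip route` on the SSVM.
ROUTES = [
    ("169.254.0.0/16", "eth0"),
    ("192.168.0.0/16", "eth1"),
    ("192.168.0.0/16", "eth2"),
    ("192.168.0.0/16", "eth3"),
]

# The same table with eth2 moved ahead of eth1, as in the workaround.
REORDERED = [ROUTES[0], ROUTES[2], ROUTES[1], ROUTES[3]]


def pick_device(routes, address):
    """Longest-prefix match; on a tie the earliest entry wins (assumption)."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, dev in routes:
        net = ipaddress.ip_network(prefix)
        # Strict '>' keeps the first entry when prefix lengths are equal.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, dev)
    return best[1] if best else None


if __name__ == "__main__":
    print(pick_device(ROUTES, "192.168.0.1"))
    print(pick_device(REORDERED, "192.168.0.1"))
```

Under this model the gateway is reached through eth1 with the original table
and through eth2 after the reordering, which is the interface the ARP
requests for 192.168.3.110 are addressed to.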
