Hi, 

We have a test environment where a basic zone is deployed.
System VM and guest addresses are both in the 192.168.0.0/16 subnet, although 
in distinct IP ranges.

We noticed that the SSVM is unable to download templates, as the connection 
over the public interface (eth2) is suddenly dropped (see attached dump).
As the dump shows, the connection drops because the SSVM fails to answer ARP 
requests from the gateway on eth2.
ARP requests sent to eth2's address from other machines in the same network 
also go unanswered.

Here is the relevant configuration information from the SSVM:

root@s-3-VM:~# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 0e:00:a9:fe:02:5c brd ff:ff:ff:ff:ff:ff
    inet 169.254.2.92/16 brd 169.254.255.255 scope global eth0
    inet6 fe80::c00:a9ff:fefe:25c/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 06:de:1e:00:00:03 brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.102/16 brd 192.168.255.255 scope global eth1
    inet6 fe80::4de:1eff:fe00:3/64 scope link
       valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 06:5d:f6:00:00:0b brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.110/16 brd 192.168.255.255 scope global eth2
    inet6 fe80::45d:f6ff:fe00:b/64 scope link
       valid_lft forever preferred_lft forever
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 06:93:c6:00:00:04 brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.103/16 brd 192.168.255.255 scope global eth3
    inet6 fe80::493:c6ff:fe00:4/64 scope link
       valid_lft forever preferred_lft forever
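
To make the overlap explicit, here is a minimal Python sketch that groups the 
interfaces by connected network. The addresses are copied from the output 
above; the function names are hypothetical, and the script only inspects the 
hard-coded values rather than querying the SSVM.

```python
import ipaddress
from collections import defaultdict

# Interface addresses copied from the `ip addr show` output above.
ADDRESSES = {
    "eth0": "169.254.2.92/16",
    "eth1": "192.168.3.102/16",
    "eth2": "192.168.3.110/16",
    "eth3": "192.168.3.103/16",
}


def group_by_network(addresses):
    """Map each connected network to the sorted interfaces configured on it."""
    groups = defaultdict(list)
    for dev, cidr in addresses.items():
        network = ipaddress.ip_interface(cidr).network
        groups[str(network)].append(dev)
    return {net: sorted(devs) for net, devs in groups.items()}


def shared_networks(addresses):
    """Keep only the networks that more than one interface is attached to."""
    grouped = group_by_network(addresses)
    return {net: devs for net, devs in grouped.items() if len(devs) > 1}


if __name__ == "__main__":
    for net, devs in shared_networks(ADDRESSES).items():
        print(f"{net}: {', '.join(devs)}")  # 192.168.0.0/16: eth1, eth2, eth3
```

Three of the four interfaces sit on the same /16, which is the precondition 
for the behaviour described in the rest of this message.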


root@s-3-VM:~# ip route
169.254.0.0/16 dev eth0  proto kernel  scope link  src 169.254.2.92
192.168.0.0/16 dev eth1  proto kernel  scope link  src 192.168.3.102
192.168.0.0/16 dev eth2  proto kernel  scope link  src 192.168.3.110
192.168.0.0/16 dev eth3  proto kernel  scope link  src 192.168.3.103
default via 192.168.0.1 dev eth2 


root@s-3-VM:~# sysctl -a | grep ipv4.conf.*.arp
error: permission denied on key 'net.ipv4.route.flush'
net.ipv4.conf.all.proxy_arp = 0
net.ipv4.conf.all.arp_filter = 0
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 2
net.ipv4.conf.all.arp_accept = 0
net.ipv4.conf.all.arp_notify = 0
net.ipv4.conf.default.proxy_arp = 0
net.ipv4.conf.default.arp_filter = 0
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.default.arp_ignore = 2
net.ipv4.conf.default.arp_accept = 0
net.ipv4.conf.default.arp_notify = 0
net.ipv4.conf.lo.proxy_arp = 0
net.ipv4.conf.lo.arp_filter = 0
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.lo.arp_ignore = 2
net.ipv4.conf.lo.arp_accept = 0
net.ipv4.conf.lo.arp_notify = 0
net.ipv4.conf.eth0.proxy_arp = 0
net.ipv4.conf.eth0.arp_filter = 0
net.ipv4.conf.eth0.arp_announce = 2
net.ipv4.conf.eth0.arp_ignore = 2
net.ipv4.conf.eth0.arp_accept = 0
net.ipv4.conf.eth0.arp_notify = 0
net.ipv4.conf.eth1.proxy_arp = 0
net.ipv4.conf.eth1.arp_filter = 0
net.ipv4.conf.eth1.arp_announce = 2
net.ipv4.conf.eth1.arp_ignore = 2
net.ipv4.conf.eth1.arp_accept = 0
net.ipv4.conf.eth1.arp_notify = 0
net.ipv4.conf.eth2.proxy_arp = 0
net.ipv4.conf.eth2.arp_filter = 0
net.ipv4.conf.eth2.arp_announce = 2
net.ipv4.conf.eth2.arp_ignore = 2
net.ipv4.conf.eth2.arp_accept = 0
net.ipv4.conf.eth2.arp_notify = 0
net.ipv4.conf.eth3.proxy_arp = 0
net.ipv4.conf.eth3.arp_filter = 0
net.ipv4.conf.eth3.arp_announce = 2
net.ipv4.conf.eth3.arp_ignore = 2
net.ipv4.conf.eth3.arp_accept = 0
net.ipv4.conf.eth3.arp_notify = 0

The behaviour is exactly what one would expect if arp_filter were enabled on 
the interfaces, but the flag is clearly set to 0. Also, setting arp_ignore to 0 
does not cause the expected ARP flux problem: replies are sent only from the 
first virtual interface (eth1). It looks as if policies were being enforced 
through arptables, but the module does not seem to be loaded, and the 
userspace utility is not available on the SSVM.
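
For reference, the following Python sketch encodes a simplified reading of the 
arp_ignore values 0, 1 and 2 as described in the kernel's ip-sysctl 
documentation. It is only a model under stated assumptions: the function name 
is hypothetical, the higher modes are not covered, and it ignores any routing 
lookup or other filtering the kernel may apply before replying.

```python
import ipaddress

# Addresses from the `ip addr show` output in this message.
LOCAL = {
    "eth1": ipaddress.ip_interface("192.168.3.102/16"),
    "eth2": ipaddress.ip_interface("192.168.3.110/16"),
    "eth3": ipaddress.ip_interface("192.168.3.103/16"),
}


def would_reply(arp_ignore, in_dev, target_ip, sender_ip, local=LOCAL):
    """Simplified model of arp_ignore modes 0, 1 and 2 (others not modelled)."""
    target = ipaddress.ip_address(target_ip)
    sender = ipaddress.ip_address(sender_ip)
    if arp_ignore == 0:
        # Mode 0: reply for a target address configured on any interface.
        return any(iface.ip == target for iface in local.values())
    iface = local.get(in_dev)
    on_incoming = iface is not None and iface.ip == target
    if arp_ignore == 1:
        # Mode 1: the target must be configured on the incoming interface.
        return on_incoming
    if arp_ignore == 2:
        # Mode 2: additionally, the sender must be in that interface's subnet.
        return on_incoming and sender in iface.network
    raise ValueError("arp_ignore mode not modelled in this sketch")


if __name__ == "__main__":
    # Gateway asking for eth2's address, request received on eth2.
    print(would_reply(2, "eth2", "192.168.3.110", "192.168.0.1"))  # True
    # Same request received on eth1 instead.
    print(would_reply(2, "eth1", "192.168.3.110", "192.168.0.1"))  # False
```

Under this reading, a request from the gateway for 192.168.3.110 arriving on 
eth2 should be answered with arp_ignore = 2, which is why the observed silence 
is surprising.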

Of course, reordering the routing table as follows, i.e. putting eth2 before 
eth1 for 192.168.0.0/16, solves the issue:

169.254.0.0/16 dev eth0  proto kernel  scope link  src 169.254.2.92
192.168.0.0/16 dev eth2  proto kernel  scope link  src 192.168.3.110
192.168.0.0/16 dev eth1  proto kernel  scope link  src 192.168.3.102
192.168.0.0/16 dev eth3  proto kernel  scope link  src 192.168.3.103
default via 192.168.0.1 dev eth2
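
The effect of the ordering can be illustrated with a small Python sketch. It 
parses the two tables quoted in this message and picks the outgoing device for 
the gateway address, then applies an arp_filter-style check (reply only if the 
route back to the sender uses the interface the request arrived on). The 
function names are hypothetical, and the tie-break rule (first listed entry 
wins among equal prefixes) is an assumption made for the model, not a 
statement about the kernel's implementation.

```python
import ipaddress

ORIGINAL = """\
169.254.0.0/16 dev eth0  proto kernel  scope link  src 169.254.2.92
192.168.0.0/16 dev eth1  proto kernel  scope link  src 192.168.3.102
192.168.0.0/16 dev eth2  proto kernel  scope link  src 192.168.3.110
192.168.0.0/16 dev eth3  proto kernel  scope link  src 192.168.3.103
default via 192.168.0.1 dev eth2
"""

REORDERED = """\
169.254.0.0/16 dev eth0  proto kernel  scope link  src 169.254.2.92
192.168.0.0/16 dev eth2  proto kernel  scope link  src 192.168.3.110
192.168.0.0/16 dev eth1  proto kernel  scope link  src 192.168.3.102
192.168.0.0/16 dev eth3  proto kernel  scope link  src 192.168.3.103
default via 192.168.0.1 dev eth2
"""


def parse_routes(text):
    """Return (network, device) pairs in the order they are listed."""
    routes = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or "dev" not in fields:
            continue
        prefix = "0.0.0.0/0" if fields[0] == "default" else fields[0]
        dev = fields[fields.index("dev") + 1]
        routes.append((ipaddress.ip_network(prefix), dev))
    return routes


def egress_device(routes, destination):
    """Longest prefix wins; among equal prefixes the first listed wins
    (assumption of this model)."""
    dest = ipaddress.ip_address(destination)
    best = None
    for network, dev in routes:
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, dev)
    return best[1] if best else None


def replies_under_arp_filter(routes, in_dev, sender_ip):
    """arp_filter-style check: reply only if the sender is routed via in_dev."""
    return egress_device(routes, sender_ip) == in_dev


if __name__ == "__main__":
    for name, table in (("original", ORIGINAL), ("reordered", REORDERED)):
        routes = parse_routes(table)
        print(name, egress_device(routes, "192.168.0.1"),
              replies_under_arp_filter(routes, "eth2", "192.168.0.1"))
```

In this model the original table routes the gateway via eth1, so a request 
arriving on eth2 is not answered, while the reordered table routes it via 
eth2 and the request is answered. This matches what we observe, even though 
arp_filter is reported as 0 on the SSVM.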

Quite interestingly, after this change ARP requests to eth2 are honoured by the 
SSVM even after it is rebooted, and even if the relevant ARP cache entry in the 
gateway is removed. Of course, this is not the case when the SSVM is destroyed, 
as the new SSVM will have a different MAC address for every interface.

It is also worth noting that in another setup, where we configured an advanced 
zone, this problem does not occur. Even though that SSVM is deployed in an 
advanced zone, its network configuration is very similar.

It would be great if you could provide some advice for debugging this issue, 
or share similar experiences.

Regards,
Salvatore
