The issue turned out to be related to SR-IOV, though I'm not sure exactly why. After disabling all VFs, things started working normally again. I was using the PF as a trunk for VMs with OVS + VLAN; the weird part is that this setup worked on some compute nodes.
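Since only some compute nodes were affected, it may help to compare the VF state across nodes before changing anything. Below is a minimal sketch, not something from the original thread: the function names list_vfs and disable_vfs are made up for illustration, and the optional sysfs-root argument exists only so the functions can be tried against a scratch directory. The only kernel interface assumed is the standard sriov_numvfs sysfs file.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any tool).

# Print "<interface> <number of enabled VFs>" for every interface
# that exposes sriov_numvfs. $1 is the sysfs net directory
# (defaults to /sys/class/net).
list_vfs() {
    net_root="${1:-/sys/class/net}"
    for dev in "$net_root"/*; do
        numvfs_file="$dev/device/sriov_numvfs"
        # Interfaces without SR-IOV support have no sriov_numvfs file.
        [ -r "$numvfs_file" ] || continue
        printf '%s %s\n' "$(basename "$dev")" "$(cat "$numvfs_file")"
    done
}

# Disable all VFs on one interface by writing 0 to sriov_numvfs.
# $1 is the interface name, $2 the sysfs net directory (optional).
# Needs root when run against the real /sys.
disable_vfs() {
    iface="$1"
    net_root="${2:-/sys/class/net}"
    echo 0 > "$net_root/$iface/device/sriov_numvfs"
}
```

Running list_vfs on a working node and on a broken node should show whether they differ in VF configuration; disable_vfs eth3 does the same thing as the echo command I used.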
The command I used to disable all VFs:

echo '0' > /sys/class/net/eth3/device/sriov_numvfs

On Tue, Sep 22, 2015 at 4:47 PM, Nasir Mahmood <nasir.mahm...@gmail.com> wrote:

> try this
>
> http://2014.texaslinuxfest.org/sites/default/files/HopkinsPPTdeck.pdf
>
> On Tue, Sep 22, 2015 at 1:16 PM, Sam Stoelinga <sammiest...@gmail.com> wrote:
>
>> Hi all,
>>
>> Would appreciate your networking expertise.
>>
>> I have a 7-node environment: 3 controllers, which also run as network
>> nodes, and 4 compute nodes. Two of the compute nodes are behaving
>> correctly, both sending and receiving packets, but the other two can send
>> broadcasts such as ARP requests and DHCP requests while the responses are
>> never received. Note that this was working previously; the issue started
>> happening out of nowhere.
>>
>> I've debugged this using tcpdump on both the compute node and the
>> controller node. The VM's DHCP request is successfully sent to the
>> controller node, and the controller node responds with a DHCP response.
>> That response is also successfully sent out of eth3 (VLAN trunk), but it
>> never arrives on the compute node. I then logged in via VNC, set the IP
>> manually and tried pinging: the outgoing ARP request was seen by all
>> nodes, but the ARP response was not received by the compute node.
>>
>> I've tried disabling hardware offloading, as I thought the NIC might be
>> discarding packets, but that didn't help. I have spent about a day
>> debugging with tcpdump and am running out of clues. Anybody with a
>> similar experience? It's weird that the node can send out packets that
>> are seen by other nodes but cannot receive the responses from them.
>>
>> tcpdump on compute which isn't receiving packets
>> tcpdump -i eth3 -nnNs 512
>> tcpdump: WARNING: eth3: no IPv4 address assigned
>> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>> listening on eth3, link-type EN10MB (Ethernet), capture size 512 bytes
>> 08:02:02.232749 STP 802.1s, Rapid STP, CIST Flags [Learn, Forward, Agreement], length 102
>> 08:02:03.990588 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:0d:26:a5 (this is a VM), length 300
>> 08:02:04.232797 STP 802.1s, Rapid STP, CIST Flags [Learn, Forward, Agreement], length 102
>> 08:02:11.393997 LLDP, length 318: HP
>> 08:02:12.232627 STP 802.1s, Rapid STP, CIST Flags [Learn, Forward, Agreement], length 102
>> 08:02:13.118051 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:0d:26:a5, length 300
>> 08:02:14.232638 STP 802.1s, Rapid STP, CIST Flags [Learn, Forward, Agreement], length 102
>>
>> tcpdump on controller
>> ip netns exec qdhcp-878e1f0a-abba-4637-8afd-2814a38136a5 tcpdump -i tape2f5fa09-19 -nnNs 512
>> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>> listening on tape2f5fa09-19, link-type EN10MB (Ethernet), capture size 512 bytes
>> 08:01:32.591906 ARP, Request who-has 192.168.111.122 tell 192.168.111.3, length 28
>> 08:01:33.033029 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:0d:26:a5, length 300
>> 08:01:33.033327 IP 192.168.111.3.67 > 192.168.111.122.68: BOOTP/DHCP, Reply, length 328
>> 08:01:38.048136 ARP, Request who-has 192.168.111.122 tell 192.168.111.3, length 28
>> 08:01:39.051897 ARP, Request who-has 192.168.111.122 tell 192.168.111.3, length 28
>> 08:01:40.051901 ARP, Request who-has 192.168.111.122 tell 192.168.111.3, length 28
>> 08:01:45.168519 ARP, Request who-has 192.168.111.138 tell 192.168.111.138, length 46
>> 08:01:45.168546 ARP, Request who-has 192.168.111.138 tell 192.168.111.138, length 46
>> 08:01:46.962341 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:0d:26:a5, length 300
>> 08:01:46.962584 IP 192.168.111.3.67 > 192.168.111.122.68: BOOTP/DHCP, Reply, length 328
>> 08:01:49.438756 ARP, Request who-has 192.168.111.138 tell 192.168.111.138, length 46
>> 08:01:49.438780 ARP, Request who-has 192.168.111.138 tell 192.168.111.138, length 46
>> 08:01:51.346561 ARP, Request who-has 192.168.111.112 tell 192.168.111.136, length 46
>> 08:01:51.967905 ARP, Request who-has 192.168.111.122 tell 192.168.111.3, length 28
>>
>> Thanks,
>> Sam Stoelinga
>>
>> _______________________________________________
>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>> Post to     : openstack@lists.openstack.org
>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>
>
>
> --
> Nasir Mahmood