> -----Original Message-----
> From: Tantilov, Emil S [mailto:[email protected]]
> Sent: Tuesday, July 02, 2013 12:47 PM
> To: Haller, John H (John); [email protected]
> Subject: RE: [E1000-devel] ixgbe: BUG changing MTU or LRO setting disables VF
> multicast reception
>
> >-----Original Message-----
> >From: Haller, John H (John) [mailto:[email protected]]
> >Sent: Saturday, June 29, 2013 8:13 AM
> >To: [email protected]
> >Subject: [E1000-devel] ixgbe: BUG changing MTU or LRO setting disables
> >VF multicast reception
> >
> >Any change to the device which causes ixgbe_do_reset to be called,
> >including things like changing the LRO setting or changing the MTU, causes
> >the multicast table to be disabled. This prevents multicast packets from
> >being sent to VFs, which also disables IPv6 to the VFs.
> >
> >As a quick hack, I added the following call to the end of
> >ixgbe_set_vf_multicasts:
> > IXGBE_WRITE_REG(hw, IXGBE_MCSTCTRL, IXGBE_MCSTCTRL_MFE |
> > hw->mac.mc_filter_type);
>
> Thanks for the report. We were not able to reproduce the issue in house using
> the current driver. Could you please provide some more details, such as steps
> to reproduce and the kernel/driver version?
>
> Emil
It took a while to get the lab set up again.
On the host: Red Hat 2.6.32-358.11.1.el6.x86_64 kernel, now using ixgbe driver 3.16.1.
On the VM: Red Hat 2.6.32-358.2.1.el6.x86_64 kernel, using ixgbevf 2.8.7.
LRO is being disabled as a side effect of configuring bridges. This is not the
final plan, but since PXE boot doesn't support booting over ixgbevf, we are
using the bridges for network booting until we can remove the need for network
boot.
The IPv6 address is added to the destination VM manually after it comes up,
with this command:
ip -6 addr add 2500:0:0:339::42:10/64 dev eth4
On the host, eth0 is configured this way (ifcfg-eth0):
DEVICE=eth0
TYPE=Ethernet
NM_CONTROLLED=no
ONBOOT=yes
VLAN=yes
IPADDR=10.51.40.96
NETMASK=255.255.240.0
There are a large number of VLAN interfaces and bridges configured on the host.
The VLANs are used to direct different connections to different ports on a
connected Ethernet switch (in this case, an HP 6120XG in an HP c7000 chassis).
Here is the directory content of /etc/sysconfig/network-scripts:
ifcfg-br10 ifcfg-br9 ifcfg-eth0.730 ifcfg-eth1.971
ifcfg-br1000 ifcfg-br910 ifcfg-eth0.740 ifcfg-eth1.981
ifcfg-br1001 ifcfg-br911 ifcfg-eth0.910 ifcfg-eth1.991
ifcfg-br1010 ifcfg-br920 ifcfg-eth0.920 ifcfg-lo
ifcfg-br1011 ifcfg-br921 ifcfg-eth0.930 ifdown
ifcfg-br1020 ifcfg-br930 ifcfg-eth0.940 ifdown-bnep
ifcfg-br1021 ifcfg-br931 ifcfg-eth0.950 ifdown-eth
ifcfg-br1030 ifcfg-br940 ifcfg-eth0.960 ifdown-ippp
ifcfg-br1031 ifcfg-br941 ifcfg-eth0.970 ifdown-ipv6
ifcfg-br1040 ifcfg-br950 ifcfg-eth0.980 ifdown-isdn
ifcfg-br1041 ifcfg-br951 ifcfg-eth0.990 ifdown-post
ifcfg-br1050 ifcfg-br960 ifcfg-eth1 ifdown-ppp
ifcfg-br1051 ifcfg-br961 ifcfg-eth1:0 ifdown-routes
ifcfg-br1060 ifcfg-br970 ifcfg-eth1.1001 ifdown-sit
ifcfg-br1061 ifcfg-br971 ifcfg-eth1.1011 ifdown-tunnel
ifcfg-br11 ifcfg-br980 ifcfg-eth1.1021 ifup
ifcfg-br12 ifcfg-br981 ifcfg-eth1.1031 ifup-aliases
ifcfg-br13 ifcfg-br990 ifcfg-eth1.1041 ifup-bnep
ifcfg-br14 ifcfg-br991 ifcfg-eth1.1051 ifup-eth
ifcfg-br15 ifcfg-eth0 ifcfg-eth1.1061 ifup-ippp
ifcfg-br16 ifcfg-eth0:0 ifcfg-eth1.601 ifup-ipv6
ifcfg-br17 ifcfg-eth0.1000 ifcfg-eth1.611 ifup-isdn
ifcfg-br18 ifcfg-eth0.1010 ifcfg-eth1.621 ifup-plip
ifcfg-br19 ifcfg-eth0.1020 ifcfg-eth1.631 ifup-plusb
ifcfg-br2 ifcfg-eth0.1030 ifcfg-eth1.641 ifup-post
ifcfg-br20 ifcfg-eth0.1040 ifcfg-eth1.651 ifup-ppp
ifcfg-br2:0 ifcfg-eth0.1050 ifcfg-eth1.661 ifup-routes
ifcfg-br21 ifcfg-eth0.1060 ifcfg-eth1.701 ifup-sit
ifcfg-br22 ifcfg-eth0.600 ifcfg-eth1.711 ifup-tunnel
ifcfg-br23 ifcfg-eth0.610 ifcfg-eth1.721 ifup-wireless
ifcfg-br24 ifcfg-eth0.620 ifcfg-eth1.731 init.ipv6-global
ifcfg-br25 ifcfg-eth0.630 ifcfg-eth1.741 ipm-notify
ifcfg-br3 ifcfg-eth0.640 ifcfg-eth1.911 net.hotplug
ifcfg-br4 ifcfg-eth0.650 ifcfg-eth1.921 network-functions
ifcfg-br5 ifcfg-eth0.660 ifcfg-eth1.931 network-functions-ipv6
ifcfg-br6 ifcfg-eth0.700 ifcfg-eth1.941 route-eth0
ifcfg-br7 ifcfg-eth0.710 ifcfg-eth1.951
ifcfg-br8 ifcfg-eth0.720 ifcfg-eth1.961
The bridge and VLAN configurations are all essentially identical; here is an
example:
ifcfg-eth0.910:
DEVICE=eth0.910
TYPE=Ethernet
BRIDGE=br910
NM_CONTROLLED=no
VLAN=yes
ONBOOT=yes
The corresponding bridge in ifcfg-br910 has this:
DEVICE=br910
TYPE=Bridge
NM_CONTROLLED=no
ONBOOT=yes
When the bridges are configured, the kernel turns LRO off. This is the
first time ixgbe_do_reset is called.
In addition, because QinQ VLAN tagging is used on the VLANs, the MTU
on eth0 and the bridges is changed to 1508. This causes the second call to
ixgbe_do_reset. After this, the VLAN configured on the SRIOV port has to be
re-established (I'm not sure why the libvirt vlan setting doesn't work). This
is done after the VM is created:
ip link set eth0 vf 0 vlan 610
ixgbe_do_reset eventually calls ixgbe_init_rx_addrs_generic, which has this
call:
/* Clear the MTA */
hw->addr_ctrl.mta_in_use = 0;
IXGBE_WRITE_REG(hw, IXGBE_MCSTCTRL, hw->mac.mc_filter_type);
The above doesn't actually clear the MTA, but only clears the MFE bit, see
below.
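A minimal sketch of that effect, outside the driver. Everything prefixed
fake_ is a made-up stand-in, not ixgbe code; the 0x4 value for the MFE bit is
assumed to match IXGBE_MCSTCTRL_MFE in ixgbe_type.h, and the 128-entry table
size is assumed for 82599. It models only the single register write quoted
above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match IXGBE_MCSTCTRL_MFE (multicast filter enable). */
#define FAKE_MCSTCTRL_MFE 0x4u
/* Assumed MTA size: 128 x 32-bit entries. */
#define FAKE_MTA_ENTRIES 128

/* Hypothetical stand-in for the relevant hardware state. */
struct fake_hw {
    uint32_t mcstctrl;               /* MCSTCTRL register */
    uint32_t mta[FAKE_MTA_ENTRIES];  /* multicast table array */
    uint32_t mc_filter_type;         /* filter type bits of MCSTCTRL */
};

/* Same shape as the quoted write: the register is overwritten with the
 * filter type only, so the MFE bit ends up 0. This write does not touch
 * the MTA entries. */
static void fake_reset_write(struct fake_hw *hw)
{
    hw->mcstctrl = hw->mc_filter_type;
}

/* Returns 1 if multicast filtering through the MTA is enabled. */
static int fake_mfe_enabled(const struct fake_hw *hw)
{
    return (hw->mcstctrl & FAKE_MCSTCTRL_MFE) != 0;
}

/* Walks through the sequence; aborts via assert if the model disagrees. */
static void fake_demo(void)
{
    struct fake_hw hw = {0};

    hw.mta[3] = 0x8u;  /* pretend one multicast hash bit is programmed */
    hw.mcstctrl = FAKE_MCSTCTRL_MFE | hw.mc_filter_type;
    assert(fake_mfe_enabled(&hw));

    fake_reset_write(&hw);
    assert(!fake_mfe_enabled(&hw));  /* filtering is now off */
    assert(hw.mta[3] == 0x8u);       /* table bit unchanged by this write */
}
```

Calling fake_demo() from a main() should return silently if the model holds.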
Unfortunately, MFE was originally set by a call to
ixgbe_update_mc_addr_list_generic, which sets MFE if at least one MTA entry is
in use (mta_in_use > 0). MFE was set this way for the original IPv6 link-local
multicast MAC addresses, but it is never set again after ixgbe_reset.
I verified this by inserting calls to dump_stack wherever MFE was set or
cleared.
Here is the code which sets MFE in ixgbe_update_mc_addr_list_generic:
if (hw->addr_ctrl.mta_in_use > 0)
        IXGBE_WRITE_REG(hw, IXGBE_MCSTCTRL,
                        IXGBE_MCSTCTRL_MFE | hw->mac.mc_filter_type);
The VF use of multicast (triggered by a VF configuring a multicast MAC
address after an IPv6 address is configured) never modifies the MFE bit in
MCSTCTRL. While ixgbe_set_vf_multicasts and ixgbe_restore_vf_multicasts both
modify the MTA table itself, neither sets the MFE bit.
As mentioned in the original email, adding code to set the MFE bit in
ixgbe_set_vf_multicasts caused ping6 to 2500:0:0:339::42:10 to start working.
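The sequence described above can be sketched as a small stand-alone model. This
is a simplification for illustration, not the driver: all sim_ names are
hypothetical, the MFE value (0x4) and the 128-entry MTA size are assumptions,
and the model follows the behaviour as described in this mail (the PF path
sets MFE, the reset write clears it, the VF path only touches the MTA).

```c
#include <assert.h>
#include <stdint.h>

#define SIM_MCSTCTRL_MFE 0x4u  /* assumed to match IXGBE_MCSTCTRL_MFE */
#define SIM_MTA_ENTRIES  128   /* assumed MTA size (128 x 32 bits) */

struct sim_hw {
    uint32_t mcstctrl;
    uint32_t mta[SIM_MTA_ENTRIES];
    uint32_t mc_filter_type;
    uint32_t mta_in_use;
};

/* Set the MTA bit selected by a 12-bit hash: the upper 7 bits pick the
 * register, the lower 5 bits pick the bit within it. */
static void sim_set_mta_bit(struct sim_hw *hw, uint16_t hash)
{
    hw->mta[(hash >> 5) & 0x7F] |= 1u << (hash & 0x1F);
}

/* PF path, modelled on ixgbe_update_mc_addr_list_generic: program the
 * hashes, then set MFE only if at least one entry is in use. */
static void sim_pf_update_mc_list(struct sim_hw *hw, const uint16_t *h, int n)
{
    hw->mta_in_use = 0;
    for (int i = 0; i < n; i++) {
        sim_set_mta_bit(hw, h[i]);
        hw->mta_in_use++;
    }
    if (hw->mta_in_use > 0)
        hw->mcstctrl = SIM_MCSTCTRL_MFE | hw->mc_filter_type;
}

/* Reset path, modelled on the write in ixgbe_init_rx_addrs_generic. */
static void sim_reset(struct sim_hw *hw)
{
    hw->mta_in_use = 0;
    hw->mcstctrl = hw->mc_filter_type;  /* MFE is now 0 */
}

/* VF path, modelled on ixgbe_set_vf_multicasts: updates the MTA only.
 * With apply_hack != 0 it also re-enables MFE, as in the quick hack. */
static void sim_vf_set_multicasts(struct sim_hw *hw, const uint16_t *h, int n,
                                  int apply_hack)
{
    for (int i = 0; i < n; i++)
        sim_set_mta_bit(hw, h[i]);
    if (apply_hack)
        hw->mcstctrl = SIM_MCSTCTRL_MFE | hw->mc_filter_type;
}

static int sim_mfe_enabled(const struct sim_hw *hw)
{
    return (hw->mcstctrl & SIM_MCSTCTRL_MFE) != 0;
}

/* Walks through: PF sets MFE, reset clears it, VF add leaves it clear,
 * VF add with the hack sets it again. */
static void sim_demo(void)
{
    struct sim_hw hw = {0};
    const uint16_t pf_hash[] = { 0x123 };
    const uint16_t vf_hash[] = { 0xABC };

    sim_pf_update_mc_list(&hw, pf_hash, 1);
    assert(sim_mfe_enabled(&hw));
    sim_reset(&hw);
    assert(!sim_mfe_enabled(&hw));
    sim_vf_set_multicasts(&hw, vf_hash, 1, 0);
    assert(!sim_mfe_enabled(&hw));  /* MTA bit set, but filter disabled */
    sim_vf_set_multicasts(&hw, vf_hash, 1, 1);
    assert(sim_mfe_enabled(&hw));
}
```

Calling sim_demo() from a main() exercises the four steps. The model says
nothing about whether setting MFE in the VF path is the right fix in the real
driver; it only shows why the hack changes the observed behaviour.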
My concern with my hack is that ixgbe_sriov.c and ixgbe_common.c both
modify the MTA table, but neither section of code seems to be concerned
with what the other may already have done to it. There may be deeper
aspects to how the driver works in SRIOV mode that I am missing, as the
configuration associated with eth0 appears to be in the MTA table via the
second VF (1).
Regards,
John Haller