> one bug at a time, please.

Absolutely! I only mentioned the "GRO implementation" because I wondered
whether it might be related. I should have researched it better
beforehand; that would have shown me that it wasn't.

I've tested the v4.4-wily kernel from the first link
(4.4.0-040400-generic), and it failed almost immediately after the
machine came online. I'm attaching a redacted syslog with the relevant
messages. One thing you'll note is that the i40e driver (1.3.x)
complains that the firmware is too new, which might be a problem in
itself. There is also this message, just before the "TX driver issue
detected" one:

i40e 0000:02:00.1: FD filter programming failed due to incorrect filter
parameters

See the attached file for more details.
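
For anyone who wants to pull the same two messages out of their own
syslog, here is a minimal sketch. It assumes the substrings quoted above
appear verbatim, each on a single log line; the wording may differ in
other driver versions. The function name find_i40e_events is only
illustrative.

```python
# Substrings taken from the messages quoted in this comment. Other i40e
# driver versions may word them differently (assumption).
PATTERNS = (
    "TX driver issue detected",
    "FD filter programming failed",
)


def find_i40e_events(lines):
    """Return (line_number, text) pairs for lines containing any pattern.

    `lines` is any iterable of strings, for example an open syslog file.
    Line numbers start at 1.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern in line for pattern in PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits
```

Passing an open file object, such as the attached syslog, to
find_i40e_events() returns the matching lines together with their line
numbers, which makes it easier to see what was logged just before each
reset.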

We're currently running the second kernel, v4.10
(4.10.0-041000-generic), and it's running fine so far, but the machine
has only been up for 30 minutes. I'll let it run for 24 hours and report
back tomorrow, or sooner if the status changes.

** Attachment added: "redacted_i40e_syslog.txt"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1723127/+attachment/4974593/+files/redacted_i40e_syslog.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1723127

Title:
  Intel i40e PF reset due to incorrect MDD detection (continues...)

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress

Bug description:
  This is a continuation from bug 1713553; a patch was added in that bug
  to attempt to fix this, and it may have helped reduce the issue but
  appears not to have fixed it, based on more reports.

  The issue is that the i40e driver, when TSO is enabled, sometimes sees
  the NIC firmware issue an "MDD event", where MDD stands for "Malicious
  Driver Detection".  This is only vaguely defined in the i40e spec, with
  no way to tell what the NIC actually saw that it didn't like.  So, the
  driver can do nothing but print an error message and reset the PF (or
  VF).  Unfortunately, this resets the interface, which interrupts
  network traffic while the PF is resetting.

  See bug 1713553 for more details.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1723127/+subscriptions

