Leann,

This looks like a kernel patch for your team to evaluate.

Thanks.

Michael

On 03/13/2017 02:49 PM, Launchpad Bug Tracker wrote:
> bugproxy (bugproxy) has assigned this bug to you for Ubuntu:
>
> == Comment: #0 - Mauro Sergio Martins Rodrigues - 2017-02-22 06:48:42 ==
> While investigating bug #145959 I got blocked in the reproduction process
> by the following issue during interface link bring-up:
>
> [ 1.590591] i40e 0045:01:00.0: AQ command Config VSI BW allocation per TC failed = 14
> [ 1.590661] i40e 0045:01:00.0: Failed configuring TC map 255 for VSI 399
> [ 1.590669] i40e 0045:01:00.0: failed to configure TCs for main VSI tc_map 0x000000ff, err I40E_ERR_INVALID_QP_ID aq_err I40E_AQ_RC_EINVAL
>
> which prevented me from bringing the interface up and assigning an IP
> address to it.
>
> == Comment: #2 - Mauro Sergio Martins Rodrigues - 2017-02-22 07:26:36 ==
> Some missing information: the kernel is Ubuntu's 4.4.0-62-generic.
>
> When testing with 4.8.0-36-generic (from xenial's proposed) the device probe
> works fine and no similar message is seen.
>
> To obtain some more data on this I added some statements to see which TC
> MAP was applied in a healthy probe (note that the other functions, such as
> function 1, work fine, but those functions have no cable attached).
>
> root@yangtze-lp1:~/_maurosr/linux-4.4.0/drivers/net/ethernet/intel/i40e# dmesg
> [52448.914605] i40e 0045:01:00.3: i40e_ptp_stop: removed PHC on enP69p1s0f3
> [52448.981801] i40e 0045:01:00.2: i40e_ptp_stop: removed PHC on enP69p1s0f2
> [52449.069793] i40e 0045:01:00.1: i40e_ptp_stop: removed PHC on enP69p1s0f1
> [52449.173834] i40e 0045:01:00.0: i40e_ptp_stop: removed PHC on enP69p1s0f0
> [52449.264462] i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 1.4.25-k
> [52449.264468] i40e: Copyright (c) 2013 - 2014 Intel Corporation.
> [52449.264625] i40e 0045:01:00.0: Using 64-bit DMA iommu bypass
> [52449.286138] i40e 0045:01:00.0: fw 5.0.40043 api 1.5 nvm 5.02 0x80002284 0.0.0
> [52449.505657] i40e 0045:01:00.0: MAC address: 68:05:ca:2d:e9:08
> [52449.508977] i40e 0045:01:00.0: SAN MAC: 68:05:ca:2d:e9:0c
> [52449.529200] i40e 0045:01:00.0: DEBUG DATA vsi > 399;enabled_tc > 255
> [52449.531210] i40e 0045:01:00.0: AQ command Config VSI BW allocation per TC failed = 14
> [52449.531213] i40e 0045:01:00.0: Failed configuring TC map 255 for VSI 399
> [52449.531217] i40e 0045:01:00.0: failed to configure TCs for main VSI tc_map 0x000000ff, err I40E_ERR_INVALID_QP_ID aq_err I40E_AQ_RC_EINVAL
> [52449.544642] i40e 0045:01:00.0 enP69p1s0f0: renamed from eth0
> [52449.697424] i40e 0045:01:00.0: PCI-Express: Speed 8.0GT/s Width x8
> [52449.727043] i40e 0045:01:00.0: Features: PF-id[0] VFs: 32 VSIs: 34 QP: 0 RX: 1BUF RSS FD_ATR DCB VxLAN Geneve PTP VEPA
> [52449.727098] i40e 0045:01:00.1: Using 64-bit DMA iommu bypass
> [52449.748667] i40e 0045:01:00.1: fw 5.0.40043 api 1.5 nvm 5.02 0x80002284 0.0.0
> [52449.976665] i40e 0045:01:00.1: MAC address: 68:05:ca:2d:e9:09
> [52449.980685] i40e 0045:01:00.1: SAN MAC: 68:05:ca:2d:e9:0d
> [52449.994982] i40e 0045:01:00.1: DEBUG DATA vsi > 398;enabled_tc > 1
> [52450.015610] i40e 0045:01:00.1 enP69p1s0f1: renamed from eth0
> [52450.074479] i40e 0045:01:00.1: PCI-Express: Speed 8.0GT/s Width x8
> [52450.080516] i40e 0045:01:00.1: Features: PF-id[1] VFs: 32 VSIs: 34 QP: 128 RX: 1BUF RSS FD_ATR DCB VxLAN Geneve PTP VEPA
>
> Comparing function 0:
> [52449.529200] i40e 0045:01:00.0: DEBUG DATA vsi > 399;enabled_tc > 255
> and function 1:
> [52449.994982] i40e 0045:01:00.1: DEBUG DATA vsi > 398;enabled_tc > 1
>
>
> Then looking at 4.8:
> [ 123.425399] i40e: loading out-of-tree module taints kernel.
> [ 123.428958] i40e: module verification failed: signature and/or required key missing - tainting kernel
> [ 123.430690] i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 1.6.11-k
> [ 123.430691] i40e: Copyright (c) 2013 - 2014 Intel Corporation.
> [ 123.430918] i40e 0045:01:00.0: Using 64-bit DMA iommu bypass
> [ 123.450445] i40e 0045:01:00.0: fw 5.0.40043 api 1.5 nvm 5.02 0x80002284 0.0.0
> [ 123.664088] i40e 0045:01:00.0: MAC address: 68:05:ca:2d:e9:08
> [ 123.667878] i40e 0045:01:00.0: SAN MAC: 68:05:ca:2d:e9:0c
> [ 123.681915] Non-contiguous TC - Disabling DCB
> [ 123.690177] i40e 0045:01:00.0: DEBUG DATA vsi > 399, enabled_tc 1
> [ 123.713262] i40e 0045:01:00.0 enP69p1s0f0: renamed from eth0
> [ 123.864601] i40e 0045:01:00.0: Added LAN device PF0 bus=0x00 func=0x00
> [ 123.864611] i40e 0045:01:00.0: PCI-Express: Speed 8.0GT/s Width x8
> [ 123.893254] i40e 0045:01:00.0: Features: PF-id[0] VFs: 32 VSIs: 34 QP: 128 RSS FD_ATR DCB VxLAN Geneve PTP VEPA
> [ 123.893321] i40e 0045:01:00.1: Using 64-bit DMA iommu bypass
> [ 123.914829] i40e 0045:01:00.1: fw 5.0.40043 api 1.5 nvm 5.02 0x80002284 0.0.0
> [ 124.152980] i40e 0045:01:00.1: MAC address: 68:05:ca:2d:e9:09
> [ 124.156999] i40e 0045:01:00.1: SAN MAC: 68:05:ca:2d:e9:0d
> [ 124.171266] i40e 0045:01:00.1: DEBUG DATA vsi > 398, enabled_tc 1
> [ 124.196080] i40e 0045:01:00.1 enP69p1s0f1: renamed from eth0
> [ 124.253353] i40e 0045:01:00.1: Added LAN device PF1 bus=0x00 func=0x01
> [ 124.253387] i40e 0045:01:00.1: PCI-Express: Speed 8.0GT/s Width x8
> [ 124.263908] i40e 0045:01:00.1: Features: PF-id[1] VFs: 32 VSIs: 34 QP: 128 RSS FD_ATR DCB VxLAN Geneve PTP VEPA
>
>
> These 2 lines are important here:
> [ 123.681915] Non-contiguous TC - Disabling DCB
> [ 123.690177] i40e 0045:01:00.0: DEBUG DATA vsi > 399, enabled_tc 1
>
> First it decided to disable the DCB feature due to the lack of contiguous
> traffic classes, and then it used a TC MAP (enabled_tc in the device driver
> code) of 1, the same value we already knew works. With that information in
> hand I forced enabled_tc (TC MAP) to 1 in 4.4's code and it worked, so I
> suspect a bad TC mask due to DCB being enabled.
>
> == Comment: #3 - Mauro Sergio Martins Rodrigues - 2017-02-23 11:24:41 ==
> I tried the 4.4 version of i40e, but with dcbx disabled on the switch port.
> Traffic class setup and function bring-up worked fine! It used a TC MAP
> (or traffic class mask) of 1. I do understand that this is just a workaround,
> though; the device driver should deal with the case where the switch has this
> feature enabled instead of leaving the device 'broken':
>
> [ 199.762738] i40e 0045:01:00.0: Using 64-bit DMA iommu bypass
> [ 199.786589] i40e 0045:01:00.0: fw 5.0.40043 api 1.5 nvm 5.02 0x80002284 0.0.0
> [ 200.045270] i40e 0045:01:00.0: MAC address: 68:05:ca:2d:e9:08
> [ 200.048955] i40e 0045:01:00.0: SAN MAC: 68:05:ca:2d:e9:0c
> [ 200.069228] i40e 0045:01:00.0: DEBUG DATA >> dcb not enabled - first if
> [ 200.069232] i40e 0045:01:00.0: DEBUG DATA vsi > 399;enabled_tc > 1
> [ 200.088056] i40e 0045:01:00.0 enP69p1s0f0: renamed from eth0
> [ 200.240641] i40e 0045:01:00.0: PCI-Express: Speed 8.0GT/s Width x8
> [ 200.270717] i40e 0045:01:00.0: Features: PF-id[0] VFs: 32 VSIs: 34 QP: 128 RX: 1BUF RSS FD_ATR DCB VxLAN Geneve PTP VEPA
>
> The line
> [ 200.069228] i40e 0045:01:00.0: DEBUG DATA >> dcb not enabled - first if
> corresponds to the piece of code where the traffic class is defined (see:
> http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/i40e/i40e_main.c?v=4.4#L4563)
>
> Another interesting discovery is that the device behaves well when we
> turn dcbx on in the switch after it has already been probed:
>
> [ 609.566786] i40e 0045:01:00.0: DEBUG DATA >> dcb not enabled - first if
> [ 609.566794] i40e 0045:01:00.0: DEBUG DATA >> dcb not enabled - first if
> [ 611.574987] i40e 0045:01:00.0: DEBUG DATA >> SFP - second if
> [ 611.574990] i40e 0045:01:00.0: DEBUG DATA >> SFP - second if
> [ 611.574994] i40e 0045:01:00.0: DEBUG DATA vsi > 399;enabled_tc > 31
>
> and this transition sets the traffic class mask to 31 instead of 255. If we
> then unload and reload the module, it goes back to the original bad state
> we experienced in this bug:
>
> [ 746.151068] i40e 0045:01:00.0: Using 64-bit DMA iommu bypass
> [ 746.174695] i40e 0045:01:00.0: fw 5.0.40043 api 1.5 nvm 5.02 0x80002284 0.0.0
> [ 746.433649] i40e 0045:01:00.0: MAC address: 68:05:ca:2d:e9:08
> [ 746.437552] i40e 0045:01:00.0: SAN MAC: 68:05:ca:2d:e9:0c
> [ 746.457815] i40e 0045:01:00.0: DEBUG DATA >> SFP - second if
> [ 746.457819] i40e 0045:01:00.0: DEBUG DATA vsi > 399;enabled_tc > 255
> [ 746.459537] i40e 0045:01:00.0: AQ command Config VSI BW allocation per TC failed = 14
> [ 746.459541] i40e 0045:01:00.0: Failed configuring TC map 255 for VSI 399
> [ 746.459550] i40e 0045:01:00.0: failed to configure TCs for main VSI tc_map 0x000000ff, err I40E_ERR_INVALID_QP_ID aq_err I40E_AQ_RC_EINVAL
>
> == Comment: #4 - Mauro Sergio Martins Rodrigues - 2017-02-23 14:25:30 ==
> Things go smoothly in kernel 4.8 even if dcbx is enabled on the port,
> due to this commit
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=fbfe12c
> which disables dcbx when the TCs are not contiguous (that is not supported
> by the device).
>
> We should ask for a backport into 4.4.0, but I'm still investigating to
> see if something else should be included, since in comment #3 we can see
> it transitioning into a valid state when dcbx is enabled in the switch.
>
> == Comment: #5 - Mauro Sergio Martins Rodrigues - 2017-03-13 13:41:19 ==
> Even though it was already clear that this was related to kernel code, since
> it works on 4.8 and doesn't on 4.4, I decided to perform an nvm update, and
> it didn't change the scenario.
>
> Comment #2 shows the nvm version as:
>> [ 123.450445] i40e 0045:01:00.0: fw 5.0.40043 api 1.5 nvm 5.02 0x80002284 0.0.0
> Current version is:
> firmware-version: 5.05 0x8000289d 1.1568.0
>
> and the issue is still reproducible.
>
> As stated in comment #4, I can now confirm we need to backport
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=fbfe12c
> to 4.4 to avoid getting into the broken state when probing the Intel X710
> (driver i40e).
>
> ** Affects: ubuntu
>      Importance: Undecided
>      Assignee: Taco Screen team (taco-screen-team)
>      Status: New
>
>
> ** Tags: architecture-ppc64le bugnameltc-151930 severity-high
> targetmilestone-inin---

--
Michael Hohnbaum
OIL Program Manager
Power (ppc64el) Development Project Manager
Canonical, Ltd.

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1672550

Title:
  i40e Intel X710 error during device probe prevents link set up and ip
  association
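
For anyone evaluating the backport, the contiguity rule described in comment #4 can be illustrated with a small stand-alone sketch. This is not the kernel code from commit fbfe12c. It is a Python illustration written under the assumption that the driver builds a bitmask of the traffic classes referenced by the DCBX priority table and falls back to a single TC (effectively disabling DCB) when that mask has a gap. The names MAX_TRAFFIC_CLASS, tc_mask_from_priorities and usable_tc_count are hypothetical and do not appear in the driver.

```python
# Illustrative sketch only; all names here are hypothetical, not driver symbols.

MAX_TRAFFIC_CLASS = 8  # assumption: at most 8 traffic classes (TC0..TC7)


def tc_mask_from_priorities(priority_table):
    """Build a bitmask of the TCs referenced by a priority -> TC table."""
    mask = 0
    for tc in priority_table:
        mask |= 1 << tc
    return mask


def usable_tc_count(tc_mask):
    """Count TCs if they are contiguous starting at TC0.

    Returns 1 (single TC, i.e. DCB effectively disabled) when a TC is
    enabled after a gap, or when no TC is set at all.
    """
    count = 0
    seen_gap = False
    for tc in range(MAX_TRAFFIC_CLASS):
        if tc_mask & (1 << tc):
            if seen_gap:
                # A TC is enabled above an unused one: non-contiguous.
                return 1
            count += 1
        else:
            seen_gap = True
    return max(count, 1)


if __name__ == "__main__":
    # Hypothetical priority table that uses only TC0 and TC2.
    mask = tc_mask_from_priorities([0, 0, 0, 0, 2, 2, 2, 2])
    print(bin(mask), usable_tc_count(mask))
```

Under these assumptions, a priority table that references only TC0 and TC2 produces the mask 0b101 and falls back to a single TC, while masks such as 0x01 and 0x1f pass the check unchanged. The check operates on the set of TCs the switch configuration references, which may differ from the enabled_tc values printed in the debug logs above.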