On Tue, Aug 21, 2018 at 08:58:15AM -0700, Alexander Duyck wrote:
> On Mon, Aug 20, 2018 at 12:32 PM Nikita V. Shirokov <tehn...@tehnerd.com>
> wrote:
> >
> > we are getting such errors:
> >
> > [ 408.737313] ixgbe 0000:03:00.0 eth0: Detected Tx Unit Hang (XDP)
> > Tx Queue <46>
> > TDH, TDT <0>, <2>
> > next_to_use <2>
> > next_to_clean <0>
> > tx_buffer_info[next_to_clean]
> > time_stamp <0>
> > jiffies <1000197c0>
> > [ 408.804438] ixgbe 0000:03:00.0 eth0: tx hang 1 detected on queue 46,
> > resetting adapter
> > [ 408.804440] ixgbe 0000:03:00.0 eth0: initiating reset due to tx timeout
> > [ 408.817679] ixgbe 0000:03:00.0 eth0: Reset adapter
> > [ 408.866091] ixgbe 0000:03:00.0 eth0: TXDCTL.ENABLE for one or more
> > queues not cleared within the polling period
> > [ 409.345289] ixgbe 0000:03:00.0 eth0: detected SFP+: 3
> > [ 409.497232] ixgbe 0000:03:00.0 eth0: NIC Link is Up 10 Gbps, Flow
> > Control: RX/TX
> >
> > while running XDP prog on ixgbe nic.
> > right now i'm seing this on bpfnext kernel
> > (latest commit from Wed Aug 15 15:04:25 2018 -0700 ;
> > 9a76aba02a37718242d7cdc294f0a3901928aa57)
> >
> > looks like this is the same issue as reported by Brenden in
> > https://www.spinics.net/lists/netdev/msg439438.html
> >
> > --
> > Nikita V. Shirokov
>
> Could you provide some additional information about your setup.
> Specifically useful would be "ethtool -i", "ethtool -l", and lspci
> -vvv info for your device. The total number of CPUs on the system
> would be useful to know as well. In addition could you try
> reproducing

sure:

ethtool -l eth0
Channel parameters for eth0:
Pre-set maximums:
RX:             0
TX:             0
Other:          1
Combined:       63
Current hardware settings:
RX:             0
TX:             0
Other:          1
Combined:       48

# ethtool -i eth0
driver: ixgbe
version: 5.1.0-k
firmware-version: 0x800006f1
expansion-rom-version:
bus-info: 0000:03:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

# nproc
48

lspci:

03:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
    Subsystem: Intel Corporation Device 000d
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 32 bytes
    Interrupt: pin A routed to IRQ 30
    NUMA node: 0
    Region 0: Memory at c7d00000 (64-bit, non-prefetchable) [size=1M]
    Region 2: I/O ports at 6000 [size=32]
    Region 4: Memory at c7e80000 (64-bit, non-prefetchable) [size=16K]
    Expansion ROM at c7e00000 [disabled] [size=512K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Address: 0000000000000000 Data: 0000
        Masking: 00000000 Pending: 00000000
    Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
        Vector table: BAR=4 offset=00000000
        PBA: BAR=4 offset=00002000
    Capabilities: [a0] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend+
        LnkCap: Port #2, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited, L1 <8us
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [100 v1] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [140 v1] Device Serial Number 90-e2-ba-ff-ff-b6-b2-60
    Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap: MFVC- ACS-, Next Function: 0
        ARICtl: MFVC- ACS-, Function Group: 0
    Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
        IOVCap: Migration-, Interrupt Message Number: 000
        IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
        IOVSta: Migration-
        Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00
        VF offset: 128, stride: 2, Device ID: 10ed
        Supported Page Size: 00000553, System Page Size: 00000001
        Region 0: Memory at 00000000c7c00000 (64-bit, prefetchable)
        Region 3: Memory at 00000000c7b00000 (64-bit, prefetchable)
        VF Migration: offset: 00000000, BIR: 0
    Kernel driver in use: ixgbe

workaround for now is to do the same, as Brenden did in his original
finding: make sure that combined + xdp queues < max_tx_queues
(e.g. w/ combined == 14 the issue goes away).

> the issue with one of the sample XDP programs provided with the kernel
> such as the xdp2 which I believe uses the XDP_TX function. We need to
> try and create a similar setup in our own environment for
> reproduction and debugging.

will try but this could take a while, because i'm not sure that we have
ixgbe in our test lab (and it would be hard to run such test in prod)

>
> Thanks.
>
> - Alex

--
Nikita V. Shirokov
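
For reference, the workaround condition above (combined + xdp queues < max_tx_queues) can be checked with a few lines of arithmetic. This is only a rough sketch, not driver code: the helper names are made up for illustration, the figure of 48 XDP queues assumes one XDP TX queue per CPU (nproc reports 48), and the limit of 64 TX queues is an assumption that is not stated anywhere in this thread.

```python
# Sketch of the workaround arithmetic; helper names are hypothetical.

def fits_tx_budget(combined: int, xdp_queues: int, max_tx_queues: int) -> bool:
    """True if combined + xdp queues stays strictly below max_tx_queues."""
    return combined + xdp_queues < max_tx_queues


def max_safe_combined(xdp_queues: int, max_tx_queues: int) -> int:
    """Largest 'combined' value that still satisfies the condition (0 if none)."""
    return max(max_tx_queues - xdp_queues - 1, 0)


if __name__ == "__main__":
    XDP_QUEUES = 48      # assumption: one XDP TX queue per CPU, nproc == 48
    MAX_TX_QUEUES = 64   # assumption: hardware TX queue limit, not from the thread

    for combined in (48, 14):
        ok = fits_tx_budget(combined, XDP_QUEUES, MAX_TX_QUEUES)
        print(f"combined={combined}: fits={ok}")
    print("largest safe combined:", max_safe_combined(XDP_QUEUES, MAX_TX_QUEUES))
```

Under those assumptions the reported setting (combined == 48) fails the check and combined == 14 passes it, which matches what is described above. The channel count itself can be changed with "ethtool -L eth0 combined 14".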