Hi Don,
Thanks for all the explanations; they are a great help.
Can you suggest some pointers for reading more about PCIe bus
configuration/tuning?
Rgds,
Nishit Shah.
On 5/2/2013 3:41 AM, Skidmore, Donald C wrote:
> Hey Nishit,
>
> I replied inline below.
>
> Thanks,
> -Don Skidmore<[email protected]>
>
>> -----Original Message-----
>> From: Nishit Shah [mailto:[email protected]]
>> Sent: Tuesday, April 30, 2013 11:46 PM
>> To: Skidmore, Donald C
>> Cc: [email protected]
>> Subject: Re: [E1000-devel] 82599 latency increase with rx_missed_errors
>>
>>
>> Hi Don,
>>
>> On 5/1/2013 3:40 AM, Skidmore, Donald C wrote:
>>> Hi Nishit,
>>>
>>> The rx_no_dma_resources means we are dropping packets because we
>> don't have any free descriptors in the RX queue, while rx_missed_errors
>> are due to insufficient space to store an ingress packet; basically, we
>> ran out of buffers or bandwidth on the PCIe bus.
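A quick way to tell which of the two counters is actually climbing under load is to sample the ethtool statistics twice and take the delta. A minimal sketch; the interface name and the live-sampling commands in the comments are assumptions, and the runnable part demonstrates the extraction against a captured snippet from this thread:

```shell
#!/bin/sh
# Pull one counter out of `ethtool -S`-style output.
extract_stat() {  # usage: extract_stat NAME  (reads ethtool -S output on stdin)
    awk -v s="$1:" '$1 == s { print $2 }'
}

# Against a live interface you would sample twice, e.g.:
#   a=$(ethtool -S eth0 | extract_stat rx_missed_errors); sleep 1
#   b=$(ethtool -S eth0 | extract_stat rx_missed_errors)
#   echo "rx_missed_errors rate: $((b - a))/sec"

# Demonstration on a captured snippet:
sample='NIC statistics:
     rx_missed_errors: 207000
     rx_no_dma_resources: 8300000'
printf '%s\n' "$sample" | extract_stat rx_no_dma_resources   # prints 8300000
```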
>>
>> Thanks for the explanation. Is there any way to verify whether we are
>> running out of buffers or out of bandwidth on the PCIe bus?
> To a large extent these are transparent to the driver. You could turn on
> FC; if you still see the same number of rx_missed_errors, that would
> suggest you are overloading your bus.
>
>> (ixgbe loading shows PCIe 5.0GT/s:Width x8)
>>
>> # dmesg | grep "0000:06:00"
>> ixgbe 0000:06:00.0: PCI->APIC IRQ transform: INT C -> IRQ 18
>> ixgbe 0000:06:00.0: setting latency timer to 64
>> ixgbe 0000:06:00.0: irq 40 for MSI/MSI-X
>> ixgbe 0000:06:00.0: irq 41 for MSI/MSI-X
>> ixgbe 0000:06:00.0: irq 42 for MSI/MSI-X
>> ixgbe 0000:06:00.0: (PCI Express:5.0GT/s:Width x8)
>> 00:90:fb:45:f1:76
>> ixgbe 0000:06:00.0: eth0: MAC: 2, PHY: 14, SFP+: 5, PBA No:
>> ixgbe 0000:06:00.0: eth0: Enabled Features: RxQ: 2 TxQ: 2 FdirHash
>> RSS
>> ixgbe 0000:06:00.0: eth0: Intel(R) 10 Gigabit Network Connection
>> ixgbe 0000:06:00.1: PCI->APIC IRQ transform: INT D -> IRQ 19
>> ixgbe 0000:06:00.1: setting latency timer to 64
>> ixgbe 0000:06:00.1: irq 43 for MSI/MSI-X
>> ixgbe 0000:06:00.1: irq 44 for MSI/MSI-X
>> ixgbe 0000:06:00.1: irq 45 for MSI/MSI-X
>> ixgbe 0000:06:00.1: (PCI Express:5.0GT/s:Width x8)
>> 00:90:fb:45:f1:77
>> ixgbe 0000:06:00.1: eth1: MAC: 2, PHY: 14, SFP+: 6, PBA No:
>> ixgbe 0000:06:00.1: eth1: Enabled Features: RxQ: 2 TxQ: 2 FdirHash
>> RSS
>> ixgbe 0000:06:00.1: eth1: Intel(R) 10 Gigabit Network Connection
>>
>> One other interesting observation:
>> when we changed the packet buffer from 512 KB to 128 KB (by changing
>> rx_pb_size in ixgbe_82599.c), per-packet latency dropped from
>> 500 microseconds to 100 microseconds for 64-byte packets.
>> Does this indicate some relation with the packet buffer size?
> This most likely shows that your small-packet flow is overloading the PCIe
> bus. Shrinking the packet buffer means packets spend less time queued in
> it while waiting on the saturated PCIe bus, so the per-packet latency
> drops. I wouldn't consider this a solution, though; the real issue is
> that the bus can't keep up with small-packet loads.
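The 500-to-100 microsecond change lines up with simple queueing arithmetic: worst-case added delay is roughly packet-buffer size divided by drain rate. A back-of-the-envelope sketch, assuming a ~10 Gb/s effective drain toward host memory (an assumption, chosen because the saturated bus is draining at roughly line rate):

```shell
# Worst-case wait = buffer size / drain rate. With an assumed ~10 Gb/s
# drain, a full 512 KB buffer adds ~419 us of queueing delay and a
# 128 KB buffer ~105 us, close to the observed 500 us vs 100 us.
awk 'BEGIN {
    drain_gbps = 10
    for (kb = 128; kb <= 512; kb *= 4) {
        usec = kb * 1024 * 8 / (drain_gbps * 1000)   # bits / (Gb/s), in us
        printf "%3d KB buffer -> ~%.0f us of queueing delay\n", kb, usec
    }
}'
```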
>
>>>
>>> All that said, when you see the rx_no_dma_resources errors, is their rate
>> comparable with what you were seeing for rx_missed_errors? Both lead to
>> the same thing: dropped packets.
>>
>> I have found the frame size at which we start getting
>> rx_missed_errors; below are the rates at those sizes.
>>
>> frame size 110 bytes (avg. latency 45 microseconds)
>> - no rx_missed_errors.
>> - rx_no_dma_resources increase rate is 8200000/sec
>>
>> frame size 108 bytes (avg. latency 345 microseconds)
>> - rx_missed_errors increase rate is 207000/sec
>> - rx_no_dma_resources increase rate is 8300000/sec
> I assume you crossed some boundary here: with the lower frame size you have
> more PCIe transactions for the same data throughput, so once you drop to
> 108 bytes you start running out of PCIe bandwidth.
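The transaction-count point can be made concrete: at 10 Gb/s line rate, smaller frames mean many more packets per second, and every received packet carries a fixed per-packet cost on the PCIe bus (descriptor fetch, descriptor write-back, TLP headers). A rough sketch, assuming 32 bytes of per-packet descriptor traffic (a placeholder figure; the real overhead is hardware-specific):

```shell
# packets/sec at line rate: 10 Gb/s / ((frame + 20 B preamble+IFG) * 8)
# descriptor overhead:      assumed 32 B of extra bus traffic per packet
awk 'BEGIN {
    n = split("64 108 1518", sizes)
    for (i = 1; i <= n; i++) {
        frame = sizes[i]
        pps = 10e9 / ((frame + 20) * 8)
        ovh = pps * 32 * 8 / 1e9
        printf "%4d-byte frames: %5.2f Mpps, ~%.2f Gb/s descriptor overhead\n",
               frame, pps / 1e6, ovh
    }
}'
```

A 64-byte flow generates roughly 18 times as many packets, and hence descriptor transactions, as a 1518-byte flow at the same line rate, which is why a bus that copes with large frames can run out of headroom somewhere around the 108-byte mark.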
>
>>>
>>> Also, what does 'lspci -vvv' show? I'm looking to see if you are
>>> getting the full
>> PCIe bandwidth. You could also try turning on FC, which should reduce
>> these types of overflow occurrences.
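To check for a downtrained link in the lspci output, compare the negotiated LnkSta line against the LnkCap line; they should agree on speed and width. A sketch, demonstrated here on a snippet matching the output below (on a live box you would pipe in `lspci -vvv -s 06:00.0` instead):

```shell
#!/bin/sh
# Compare negotiated link state (LnkSta) with capability (LnkCap).
sample='LnkCap: Port #1, Speed 5GT/s, Width x8, ASPM L0s
LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+'
result=$(printf '%s\n' "$sample" | awk '
    /LnkCap:/ { cap = $0 }
    /LnkSta:/ { sta = $0 }
    END {
        match(cap, /Speed [^,]+, Width x[0-9]+/); c = substr(cap, RSTART, RLENGTH)
        match(sta, /Speed [^,]+, Width x[0-9]+/); s = substr(sta, RSTART, RLENGTH)
        if (c == s) print "link at full capability: " s
        else        print "DOWNTRAINED: " s " (capable of " c ")"
    }')
echo "$result"
```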
>>
>> Enabling flow control increases the latency even further. The tester
>> machine does not seem to understand the PAUSE frames, and FC also clears
>> the DROP_EN bit, which again increases the latency.
>> The lspci -vvv output is attached to this mail.
>>
>>> Thanks,
>>> -Don Skidmore<[email protected]>
>> Rgds,
>> Nishit Shah.
>>
>>>> -----Original Message-----
>>>> From: Nishit Shah [mailto:[email protected]]
>>>> Sent: Tuesday, April 30, 2013 9:07 AM
>>>> To: [email protected]
>>>> Subject: [E1000-devel] 82599 latency increase with rx_missed_errors
>>>>
>>>>
>>>> Hi,
>>>>
>>>> We are measuring packet latencies at various packet sizes (64 bytes
>>>> to
>>>> 1518 bytes) with 82599 card with ixgbe driver 3.7.21.
>>>>
>>>> Setup:
>>>>
>>>> Spirent test center sender 10G <--------> 10G machine with 82599
>>>> (ixgbe 3.7.21 and vanilla 2.6.39.4)
>>>> machine with 82599 10G <--------> 10G Spirent test center receiver
>>>>
>>>> When we don't have an increase in "rx_missed_errors" and
>>>> "rx_no_dma_resources", we are getting per packet latency around 40-70
>>>> microseconds. ("rx_no_buffer_count" is not increasing)
>>>> When we have an increase in "rx_no_dma_resources", we are still
>> getting
>>>> per packet latency around 40-70 microseconds.
>>>> ("rx_no_buffer_count" is not increasing)
>>>> When we have an increase in "rx_missed_errors", we are getting per
>>>> packet latency around 500 microseconds. (rx_no_buffer_count is not
>>>> increasing)
>>>>
>>>> Is there any specific reason for the latency increase when
>>>> "rx_missed_errors" increases?
>>>> Is there a way to control it?
>>>>
>>>> Below are the machine details.
>>>>
>> ==========================================================
>>>> ===============================================
>>>> Machine details.
>>>>
>>>> CPU: Dual Core Intel(R) Celeron(R) CPU G540 @ 2.50GHz
>>>> Memory: 2 GB
>>>> kernel: vanilla 2.6.39.4
>>>> Interface tuning parameters:
>>>> Auto Negotiation is off (DROP_EN is set.)
>>>> ethtool -G eth0 rx 64 tx 128 ; ethtool -G eth1 rx 64 tx 128
>>>> rx-usecs is set to 50.
>>>> ethtool and lspci for bus information:
>>>>
>>>> # ethtool -i eth0
>>>> driver: ixgbe
>>>> version: 3.7.21-NAPI
>>>> firmware-version: 0x80000345
>>>> bus-info: 0000:06:00.0
>>>> #
>>>> # ethtool -i eth1
>>>> driver: ixgbe
>>>> version: 3.7.21-NAPI
>>>> firmware-version: 0x80000345
>>>> bus-info: 0000:06:00.1
>>>>
>>>> 06:00.0 Class 0200: Device 8086:10fb (rev 01)
>>>> Subsystem: Device 15bb:30e0
>>>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV-
>> VGASnoop-
>>>> ParErr- Stepping- SERR- FastB2B- DisINTx+
>>>> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast
>>>> >TAbort-<TAbort-<MAbort->SERR-<PERR- INTx-
>>>> Latency: 0, Cache Line Size: 64 bytes
>>>> Interrupt: pin A routed to IRQ 18
>>>> Region 0: Memory at f7520000 (64-bit, non-prefetchable)
>> [size=128K]
>>>> Region 2: I/O ports at 8020 [size=32]
>>>> Region 4: Memory at f7544000 (64-bit, non-prefetchable)
>> [size=16K]
>>>> Capabilities: [40] Power Management version 3
>>>> Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
>>>> PME(D0+,D1-,D2-,D3hot+,D3cold-)
>>>> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>>>> Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
>>>> Address: 0000000000000000 Data: 0000
>>>> Masking: 00000000 Pending: 00000000
>>>> Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
>>>> Vector table: BAR=4 offset=00000000
>>>> PBA: BAR=4 offset=00002000
>>>> Capabilities: [a0] Express (v2) Endpoint, MSI 00
>>>> DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency
>>>> L0s<512ns,
>> L1
>>>> <64us
>>>> ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
>>>> DevCtl: Report errors: Correctable- Non-Fatal- Fatal-
>>>> Unsupported-
>>>> RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>>>> FLReset-
>>>> MaxPayload 128 bytes, MaxReadReq 512 bytes
>>>> DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+
>>>> AuxPwr- TransPend-
>>>> LnkCap: Port #1, Speed 5GT/s, Width x8, ASPM L0s,
>>>> Latency
>> L0<2us,
>>>> L1<32us
>>>> ClockPM- Surprise- LLActRep- BwNot-
>>>> LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain-
>>>> CommClk-
>>>> ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>>> LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+
>>>> DLActive- BWMgmt- ABWMgmt-
>>>> DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
>>>> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
>>>> LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance-
>>>> SpeedDis-,
>>>> Selectable De-emphasis: -6dB
>>>> Transmit Margin: Normal Operating Range,
>>>> EnterModifiedCompliance- ComplianceSOS-
>>>> Compliance De-emphasis: -6dB
>>>> LnkSta2: Current De-emphasis Level: -6dB
>>>> Capabilities: [e0] Vital Product Data
>>>> Unknown small resource type 06, will not decode more.
>>>> Kernel driver in use: ixgbe
>>>> Kernel modules: ixgbe
>>>>
>>>> 06:00.1 Class 0200: Device 8086:10fb (rev 01)
>>>> Subsystem: Device 15bb:30e0
>>>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV-
>> VGASnoop-
>>>> ParErr- Stepping- SERR- FastB2B- DisINTx+
>>>> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast
>>>> >TAbort-<TAbort-<MAbort->SERR-<PERR- INTx-
>>>> Latency: 0, Cache Line Size: 64 bytes
>>>> Interrupt: pin B routed to IRQ 19
>>>> Region 0: Memory at f7500000 (64-bit, non-prefetchable)
>> [size=128K]
>>>> Region 2: I/O ports at 8000 [size=32]
>>>> Region 4: Memory at f7540000 (64-bit, non-prefetchable)
>> [size=16K]
>>>> Capabilities: [40] Power Management version 3
>>>> Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
>>>> PME(D0+,D1-,D2-,D3hot+,D3cold-)
>>>> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>>>> Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
>>>> Address: 0000000000000000 Data: 0000
>>>> Masking: 00000000 Pending: 00000000
>>>> Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
>>>> Vector table: BAR=4 offset=00000000
>>>> PBA: BAR=4 offset=00002000
>>>> Capabilities: [a0] Express (v2) Endpoint, MSI 00
>>>> DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency
>>>> L0s<512ns,
>> L1
>>>> <64us
>>>> ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
>>>> DevCtl: Report errors: Correctable- Non-Fatal- Fatal-
>>>> Unsupported-
>>>> RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>>>> FLReset-
>>>> MaxPayload 128 bytes, MaxReadReq 512 bytes
>>>> DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+
>>>> AuxPwr- TransPend-
>>>> LnkCap: Port #1, Speed 5GT/s, Width x8, ASPM L0s,
>>>> Latency
>> L0<2us,
>>>> L1<32us
>>>> ClockPM- Surprise- LLActRep- BwNot-
>>>> LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain-
>>>> CommClk-
>>>> ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>>> LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+
>>>> DLActive- BWMgmt- ABWMgmt-
>>>> DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
>>>> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
>>>> LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance-
>>>> SpeedDis-
>> ,
>>>> Selectable De-emphasis: -6dB
>>>> Transmit Margin: Normal Operating Range,
>>>> EnterModifiedCompliance- ComplianceSOS-
>>>> Compliance De-emphasis: -6dB
>>>> LnkSta2: Current De-emphasis Level: -6dB
>>>> Capabilities: [e0] Vital Product Data
>>>> Unknown small resource type 06, will not decode more.
>>>> Kernel driver in use: ixgbe
>>>> Kernel modules: ixgbe
>>>>
>> ==========================================================
>>>> ===============================================
>>>>
>>>> Rgds,
>>>> Nishit Shah.
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Introducing AppDynamics Lite, a free troubleshooting tool for Java/.NET
>> Get
>>>> 100% visibility into your production application - at no cost.
>>>> Code-level diagnostics for performance bottlenecks with<2% overhead
>>>> Download for free and get started troubleshooting in minutes.
>>>> http://p.sf.net/sfu/appdyn_d2d_ap1
>>>> _______________________________________________
>>>> E1000-devel mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/e1000-devel
>>>> To learn more about Intel® Ethernet, visit
>>>> http://communities.intel.com/community/wired