On 08/25/2015 08:33 PM, Ananyev, Konstantin wrote:
> Hi Vlad,
>
>> -----Original Message-----
>> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
>> Sent: Thursday, August 20, 2015 10:07 AM
>> To: Ananyev, Konstantin; Lu, Wenzhuo
>> Cc: dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1
>> for all NICs but 82598
>>
>>
>>
>> On 08/20/15 12:05, Vlad Zolotarov wrote:
>>>
>>> On 08/20/15 11:56, Vlad Zolotarov wrote:
>>>>
>>>> On 08/20/15 11:41, Ananyev, Konstantin wrote:
>>>>> Hi Vlad,
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
>>>>>> Sent: Wednesday, August 19, 2015 11:03 AM
>>>>>> To: Ananyev, Konstantin; Lu, Wenzhuo
>>>>>> Cc: dev at dpdk.org
>>>>>> Subject: Re: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh
>>>>>> above 1 for all NICs but 82598
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 08/19/15 10:43, Ananyev, Konstantin wrote:
>>>>>>> Hi Vlad,
>>>>>>> Sorry for delay with review, I am OOO till next week.
>>>>>>> Meanwhile, few questions/comments from me.
>>>>>> Hi, Konstantin, long time no see... ;)
>>>>>>
>>>>>>>>>>>> This patch fixes the Tx hang we were constantly hitting with a
>>>>>>>>>>>> seastar-based application on x540 NIC.
>>>>>>>>>>> Could you help to share with us how to reproduce the tx hang
>>>>>>>>>>> issue, with using typical DPDK examples?
>>>>>>>>>> Sorry. I'm not very familiar with the typical DPDK examples to
>>>>>>>>>> help u here. However this is quite irrelevant since without this
>>>>>>>>>> this patch ixgbe PMD obviously abuses the HW spec as has been
>>>>>>>>>> explained above.
>>>>>>>>>>
>>>>>>>>>> We saw the issue when u stressed the xmit path with a lot of
>>>>>>>>>> highly fragmented TCP frames (packets with up to 33 fragments
>>>>>>>>>> with non-headers fragments as small as 4 bytes) with all offload
>>>>>>>>>> features enabled.
>>>>>>> Could you provide us with the pcap file to reproduce the issue?
>>>>>> Well, the thing is it takes some time to reproduce it (a few
>>>>>> minutes of heavy load) therefore a pcap would be quite large.
>>>>> Probably you can upload it to some place, from which we will be able
>>>>> to download it?
>>>> I'll see what I can do but no promises...
>>> On a second thought pcap file won't help u much since in order to
>>> reproduce the issue u have to reproduce exactly the same structure of
>>> clusters i give to HW and it's not what u see on wire in a TSO case.
>> And not only in a TSO case... ;)
> I understand that, but my thought was you can add some sort of TX callback
> for the rte_eth_tx_burst() into your code that would write the packet into
> pcap file and then re-run your hang scenario.
> I know that it means extra work for you - but I think it would be very
> helpful if we would be able to reproduce your hang scenario:
> - if HW guys would confirm that setting RS bit for every EOP packet is not
>   really required, then we probably have to look at what else can cause it.
> - it might be added to our validation cycle, to prevent hitting similar
>   problem in future.
> Thanks
> Konstantin
>

I think if you send packets with random fragment chains up to 32 mbufs you might see this. TSO was not required to trigger this problem.
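
To make "random fragment chains" concrete, here is a rough sketch in plain C of how the per-packet chain layout could be chosen. It is an illustration only, not the code of the application that hit the hang: the names (chain_plan, make_chain_plan, rand_range) and the limits other than the 32-segment cap and the 4-byte minimum fragment mentioned in this thread are assumptions made up for the example, and it deliberately does not touch any DPDK API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical limits for illustration; only the 32-segment cap and the
 * 4-byte minimum come from the discussion, the rest is arbitrary. */
#define MAX_SEGS      32   /* longest fragment chain to generate */
#define MIN_SEG_BYTES 4    /* smallest fragment size */
#define MAX_SEG_BYTES 256  /* arbitrary upper bound per fragment */

/* Layout of one packet: how many segments and how long each one is. */
struct chain_plan {
    uint16_t nb_segs;
    uint16_t seg_len[MAX_SEGS];
    uint32_t pkt_len;      /* sum of all seg_len[] entries */
};

/* Pseudo-random value in [lo, hi]; modulo bias is acceptable for a stress test. */
uint16_t rand_range(uint16_t lo, uint16_t hi)
{
    return (uint16_t)(lo + rand() % (hi - lo + 1));
}

/* Fill *p with a random chain of 1..MAX_SEGS segments of random sizes. */
void make_chain_plan(struct chain_plan *p)
{
    assert(p != NULL);
    p->nb_segs = rand_range(1, MAX_SEGS);
    p->pkt_len = 0;
    for (uint16_t i = 0; i < p->nb_segs; i++) {
        p->seg_len[i] = rand_range(MIN_SEG_BYTES, MAX_SEG_BYTES);
        p->pkt_len += p->seg_len[i];
    }
}
```

Each plan would then have to be turned into an actual mbuf chain with the same segment count and lengths before it is handed to rte_eth_tx_burst(). Logging the plans (with a fixed srand() seed) would also record the cluster structure that a pcap of the wire traffic does not show.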