> On 22 Nov 2016, at 14:17, Maxime Coquelin <maxime.coquelin at redhat.com> wrote:
>
> Hi Pierre,
>
> On 11/22/2016 10:54 AM, Pierre Pfister (ppfister) wrote:
>> Hello Maxime,
>>
>>> On 9 Nov 2016, at 15:51, Maxime Coquelin <maxime.coquelin at redhat.com> wrote:
>>>
>>> Hi Pierre,
>>>
>>> On 11/09/2016 01:42 PM, Pierre Pfister (ppfister) wrote:
>>>> Hello Maxime,
>>>>
>>>> Sorry for the late reply.
>>>>
>>>>> On 8 Nov 2016, at 10:44, Maxime Coquelin <maxime.coquelin at redhat.com> wrote:
>>>>>
>>>>> Hi Pierre,
>>>>>
>>>>> On 11/08/2016 10:31 AM, Pierre Pfister (ppfister) wrote:
>>>>>> The current virtio driver advertises VERSION_1 support,
>>>>>> but does not handle the device's VERSION_1 support when
>>>>>> sending packets (it looks for the ANY_LAYOUT feature,
>>>>>> which is absent).
>>>>>>
>>>>>> This patch enables 'can_push' in the tx path when VERSION_1
>>>>>> is advertised by the device.
>>>>>>
>>>>>> This significantly improves the small-packet forwarding rate
>>>>>> towards devices advertising the VERSION_1 feature.
>>>>>
>>>>> I think it depends on whether offloading is enabled or not.
>>>>> If no offloading is enabled, I measured a significant drop.
>>>>> Indeed, when no offloading is enabled, the Tx path in Virtio does
>>>>> not access the virtio header before your patch, as the header is
>>>>> memset to zero at device init time.
>>>>> With your patch, it gets memset to zero at every transmit, in the
>>>>> hot path.
>>>>
>>>> Right. On the virtio side that is true, but on the device side, we
>>>> have to access the header anyway.
>>>
>>> Not anymore, if no offload features have been negotiated.
>>> I made a patch that landed in v16.11 to skip header parsing in this
>>> case.
>>> That said, we still have to access its descriptor.
>>>
>>>> And accessing two descriptors (with the address resolution and
>>>> memory fetches that come with it) is a costly operation compared to
>>>> accessing a single one.
>>>> In the case where indirect descriptors are used, this is 1
>>>> descriptor access instead of 3 (the ring descriptor pointing to the
>>>> indirect table, plus the header and data descriptors inside it).
>>>
>>> I agree this is far from optimal.
>>>
>>>> And in the case where chained descriptors are used, this doubles
>>>> the number of packets that you can put in your queue.
>>>>
>>>> These are the results in my PHY -> VM (testpmd) -> PHY setup.
>>>> Traffic is flowing bidirectionally. Numbers are lossless rates.
>>>>
>>>> When chained buffers are used for dpdk's TX: 2x2.13Mpps
>>>> When indirect descriptors are used for dpdk's TX: 2x2.38Mpps
>>>> When shallow buffers are used for dpdk's TX (with this patch): 2x2.42Mpps
>>>
>>> When I tried it, I also ran a PVP 0% loss benchmark, and I got the
>>> opposite results: the chained and indirect cases were significantly
>>> better.
>>>
>>> My PVP setup used a single NIC and a single Virtio PMD; NIC2VM
>>> forwarding was done in IO mode with testpmd on the host, and Rx->Tx
>>> forwarding in macswap mode on the guest side.
>>>
>>> I also saw some perf regression when running a simple testpmd test
>>> on both ends.
>>>
>>> Yuanhan, did you run some benchmarks with your series enabling
>>> ANY_LAYOUT?
>>
>> It was enabled. But the spec specifies that VERSION_1 includes
>> ANY_LAYOUT. Therefore, QEMU removes ANY_LAYOUT when VERSION_1 is set.
>>
>> We can keep arguing about which is fastest. I guess we have different
>> setups and different results, so we are probably deadlocked here.
>> But in any case, the current code is inconsistent, as it uses a
>> single descriptor when ANY_LAYOUT is set, but not when VERSION_1 is
>> set.
>>
>> I believe it makes sense to use a single descriptor any time it is
>> possible, but you are free to think otherwise.
>> Please make a call and make the code consistent (remove single
>> descriptors altogether, or use them when VERSION_1 is set too).
>> Otherwise it just creates yet another testing headache.
>
> I also think it makes sense to have a single descriptor, but I had to
> highlight that I noticed a significant performance degradation when
> using a single descriptor on my setup.
>
> I'm fine with taking your patch in virtio-next, so that more testing
> is conducted.
Thanks. I just realised there was an indentation error in the patch,
meaning that this patch didn't make it into 16.11 ...
I will send a new version.

(For reference, two illustrative sketches of the can_push check and the
header handling discussed above are appended below the quoted thread.)

- Pierre

> Thanks,
> Maxime
>
>> Thanks,
>>
>> - Pierre
>>
>>>> I must also note that qemu 2.5 does not seem to deal with VERSION_1
>>>> and ANY_LAYOUT correctly.
>>>> The patch I am proposing here works with qemu 2.7, but with qemu
>>>> 2.5, testpmd still behaves as if ANY_LAYOUT (or VERSION_1) was not
>>>> available.
>>>> This is not catastrophic, but just note that you will not see the
>>>> performance improvement in some cases with qemu 2.5.
>>>
>>> Thanks for the info.
>>>
>>> Regards,
>>> Maxime
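
For readers following along, the can_push change under discussion
amounts to something like the sketch below. It is loosely modeled on
the tx path of DPDK's virtio PMD around v16.11 (virtio_xmit_pkts() in
drivers/net/virtio/virtio_rxtx.c); the helper name tx_can_push() is
hypothetical, and the exact surrounding conditions may differ from the
actual patch:

    #include <rte_mbuf.h>
    #include "virtio_pci.h" /* struct virtio_hw, vtpci_with_feature() */

    /*
     * Sketch: may the virtio-net header be pushed into the mbuf
     * headroom, so that header + packet fit in a single descriptor?
     * ANY_LAYOUT grants this explicitly; VERSION_1 implies it per the
     * virtio 1.0 spec, which is the case the patch adds.
     */
    static inline int
    tx_can_push(struct virtio_hw *hw, struct rte_mbuf *txm)
    {
        return (vtpci_with_feature(hw, VIRTIO_F_ANY_LAYOUT) ||
                vtpci_with_feature(hw, VIRTIO_F_VERSION_1)) && /* fix */
               rte_mbuf_refcnt_read(txm) == 1 && /* mbuf not shared */
               RTE_MBUF_DIRECT(txm) &&           /* headroom writable */
               txm->nb_segs == 1 &&              /* single segment */
               rte_pktmbuf_headroom(txm) >= hw->vtnet_hdr_size;
    }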
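
The header handling behind Maxime's hot-path concern, again only as a
sketch under the same assumptions (tx_setup_hdr() and the ring_hdr
parameter are hypothetical; the real logic lives in
virtqueue_enqueue_xmit() in the same file):

    #include <string.h>     /* memset() */
    #include <rte_mbuf.h>
    #include "virtio_pci.h" /* struct virtio_hw, struct virtio_net_hdr */

    /* Sketch: set up the virtio-net header for one packet. */
    static struct virtio_net_hdr *
    tx_setup_hdr(struct virtio_hw *hw, struct rte_mbuf *txm,
                 int can_push, struct virtio_net_hdr *ring_hdr)
    {
        if (can_push) {
            /* Single descriptor: the header lives in packet memory,
             * so it must be zeroed for every packet when no offload
             * flags are set - the per-packet memset in the hot path.
             * Headroom availability was checked by tx_can_push(). */
            struct virtio_net_hdr *hdr = (struct virtio_net_hdr *)
                    rte_pktmbuf_prepend(txm, hw->vtnet_hdr_size);
            memset(hdr, 0, hw->vtnet_hdr_size);
            return hdr;
        }
        /* Extra (or chained/indirect) descriptor: reuse the header
         * preallocated in the tx ring, which was zeroed once at
         * device init, so the hot path never writes it when no
         * offloads are negotiated. */
        return ring_hdr;
    }

The trade-off both setups are measuring is visible here: can_push saves
descriptor accesses per packet but adds a per-packet memset when no
offloads are negotiated, which is why different benchmarks can rank the
options differently.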