Hi Yuanhan,

On 10/14/2016 11:34 AM, Yuanhan Liu wrote:
> This is a new set of patches to optimize the mergeable Rx code path.
> No refactoring (rewrite) was made this time. It just applies some
> findings from Zhihong (kudos to him!) that could improve the mergeable
> Rx path on the old code.
>
> The two major factors that could improve the performance greatly are:
>
> - copy virtio header together with packet data. This could remove
>   the bubbles between the two copies to optimize the cache access.
>
>   This is implemented in patch 2 "vhost: optimize cache access"
>
> - shadow used ring update and update them at once
>
>   The basic idea is to update used ring in a local buffer and flush
>   them to the virtio used ring at once in the end. Again, this is
>   for optimizing the cache access.
>
>   This is implemented in patch 5 "vhost: shadow used ring update"
>
> The two optimizations could yield 40+% performance in micro testing
> and 20+% in PVP case testing with 64B packet size.
>
> Besides that, there are some tiny optimizations, such as prefetch
> avail ring (patch 6) and retrieve avail head once (patch 7).
>
> Note: the shadow used ring tech could also be applied to the non-mrg
> Rx path (and even the dequeue path). I didn't do that for two reasons:
>
> - we already update used ring in batch in both paths: it's not shadowed
>   first though.
>
> - it's a bit too late to make many changes at this stage: RC1 is out.
>
> Please help testing.
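
For reference, the shadow used ring idea quoted above can be sketched roughly as follows. This is only a minimal illustration and not the code from patch 5: the type and function names (used_elem, used_ring, shadow_ring, shadow_add, shadow_flush) and RING_SIZE are hypothetical, and the release fence stands in for whatever write barrier the real vhost code uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8 /* hypothetical ring size, must be a power of two */

/* One used ring entry: descriptor chain head and bytes written. */
struct used_elem {
	uint32_t id;
	uint32_t len;
};

/* Simplified stand-in for the used ring shared with the guest. */
struct used_ring {
	uint16_t idx; /* free-running index, wraps naturally */
	struct used_elem ring[RING_SIZE];
};

/* Local (shadow) copy, private to the host side. */
struct shadow_ring {
	struct used_elem elems[RING_SIZE];
	uint16_t count;
};

/* Record a completed buffer locally; shared memory is not touched. */
static void shadow_add(struct shadow_ring *s, uint32_t id, uint32_t len)
{
	assert(s->count < RING_SIZE);
	s->elems[s->count].id = id;
	s->elems[s->count].len = len;
	s->count++;
}

/* Copy all shadowed entries to the shared ring at once, then publish. */
static void shadow_flush(struct shadow_ring *s, struct used_ring *u)
{
	uint16_t start = u->idx & (RING_SIZE - 1);
	uint16_t first = RING_SIZE - start; /* room before wrap-around */

	if (first > s->count)
		first = s->count;

	memcpy(&u->ring[start], s->elems, first * sizeof(s->elems[0]));
	memcpy(&u->ring[0], &s->elems[first],
	       (s->count - first) * sizeof(s->elems[0]));

	/* Entries must be visible before the index update. */
	atomic_thread_fence(memory_order_release);
	u->idx += s->count;
	s->count = 0;
}
```

The point of the sketch is that the shared ring is written in at most two contiguous copies per burst, followed by a single index update, instead of one scattered write per packet.
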

I tested the following use-cases without noticing any functional problems:
- Windows guests (mergeable ON & OFF, indirect disabled): ping other VM
- Linux guests with kernel driver (mergeable ON & OFF, indirect OFF):
  iperf between 2 VMs
- Linux guest with Virtio PMD (mergeable ON & OFF): testpmd txonly on
  host, rxonly on guest.

Feel free to add my:
Tested-by: Maxime Coquelin <maxime.coquelin at redhat.com>

Thanks,
Maxime