Hi Maxime,

Do you have more comments about this set? If not, I think I could merge it shortly.

Thanks.

	--yliu

On Mon, Sep 19, 2016 at 10:00:11PM -0400, Zhihong Wang wrote:
> This patch set optimizes the vhost enqueue function.
>
> It implements the vhost logic from scratch into a single function designed
> for high performance and good maintainability, and improves CPU efficiency
> significantly by optimizing cache access, which means:
>
>  *  Higher maximum throughput can be achieved for fast frontends like DPDK
>     virtio pmd.
>
>  *  Better scalability can be achieved that each vhost core can support
>     more connections because it takes less cycles to handle each single
>     frontend.
>
> This patch set contains:
>
>  1. A Windows VM compatibility fix for vhost enqueue in 16.07 release.
>
>  2. A baseline patch to rewrite the vhost logic.
>
>  3. A series of optimization patches added upon the baseline.
>
> The main optimization techniques are:
>
>  1. Reorder code to reduce CPU pipeline stall cycles.
>
>  2. Batch update the used ring for better efficiency.
>
>  3. Prefetch descriptor to hide cache latency.
>
>  4. Remove useless volatile attribute to allow compiler optimization.
>
> Code reordering and batch used ring update bring most of the performance
> improvements.
>
> In the existing code there're 2 callbacks for vhost enqueue:
>
>  *  virtio_dev_merge_rx for mrg_rxbuf turned on cases.
>
>  *  virtio_dev_rx for mrg_rxbuf turned off cases.
>
> The performance of the existing code is not optimal, especially when the
> mrg_rxbuf feature turned on. Besides, having 2 callback paths increases
> maintenance efforts.
>
> Also, there's a compatibility issue in the existing code which causes
> Windows VM to hang when the mrg_rxbuf feature turned on.
>
> ---
> Changes in v6:
>
>  1. Merge duplicated code.
>
>  2. Introduce a function for used ring write.
>
>  3. Add necessary comments.
>
> ---
> Changes in v5:
>
>  1. Rebase to dpdk-next-virtio master.
>
>  2. Rename variables to keep consistent in naming style.
>
>  3. Small changes like return value adjustment and vertical alignment.
>
>  4. Add details in commit log.
>
> ---
> Changes in v4:
>
>  1. Fix a Windows VM compatibility issue.
>
>  2. Free shadow used ring in the right place.
>
>  3. Add failure check for shadow used ring malloc.
>
>  4. Refactor the code for clearer logic.
>
>  5. Add PRINT_PACKET for debugging.
>
> ---
> Changes in v3:
>
>  1. Remove unnecessary memset which causes frontend stall on SNB & IVB.
>
>  2. Rename variables to follow naming convention.
>
>  3. Rewrite enqueue and delete the obsolete in the same patch.
>
> ---
> Changes in v2:
>
>  1. Split the big function into several small ones.
>
>  2. Use multiple patches to explain each optimization.
>
>  3. Add comments.
>
> Zhihong Wang (6):
>   vhost: fix windows vm hang
>   vhost: rewrite enqueue
>   vhost: remove useless volatile
>   vhost: add desc prefetch
>   vhost: batch update used ring
>   vhost: optimize cache access
>
>  lib/librte_vhost/vhost.c      |  20 +-
>  lib/librte_vhost/vhost.h      |   6 +-
>  lib/librte_vhost/vhost_user.c |  31 ++-
>  lib/librte_vhost/virtio_net.c | 541 ++++++++++++++----------------------------
>  4 files changed, 225 insertions(+), 373 deletions(-)
>
> --
> 2.7.4
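
For readers unfamiliar with the "batch update the used ring" technique named in the cover letter, here is a minimal sketch of the general idea, not the code from this patch set. All names (used_elem, used_ring, shadow_ring, shadow_add, shadow_flush, RING_SIZE) are hypothetical, the ring size is fixed at 8 for illustration, and the C11 release fence only stands in for whatever write barrier a real vhost implementation would use. Completed entries are collected in a private shadow array and copied to the shared ring in at most two memcpy calls, so the shared index is written once per burst instead of once per packet.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8u /* hypothetical size; must be a power of two */

/* Hypothetical, simplified types; not the DPDK or virtio definitions. */
struct used_elem {
	uint32_t id;  /* head descriptor index of the completed chain */
	uint32_t len; /* bytes written into that chain */
};

struct used_ring {
	uint16_t idx; /* free-running index, published to the frontend */
	struct used_elem ring[RING_SIZE];
};

struct shadow_ring {
	uint16_t count; /* entries collected since the last flush */
	struct used_elem elems[RING_SIZE];
};

/* Record one completed entry locally; no shared memory is touched.
 * The caller must keep count below RING_SIZE. */
static void shadow_add(struct shadow_ring *s, uint32_t id, uint32_t len)
{
	s->elems[s->count].id = id;
	s->elems[s->count].len = len;
	s->count++;
}

/* Copy all collected entries to the shared ring, then publish once. */
static void shadow_flush(struct shadow_ring *s, struct used_ring *u)
{
	uint16_t start = u->idx & (RING_SIZE - 1);
	uint16_t first = s->count;

	/* Split the copy in two if the batch wraps past the ring end. */
	if (start + first > RING_SIZE)
		first = (uint16_t)(RING_SIZE - start);

	memcpy(&u->ring[start], s->elems, first * sizeof(s->elems[0]));
	memcpy(&u->ring[0], s->elems + first,
	       (s->count - first) * sizeof(s->elems[0]));

	/* Stand-in for the write barrier that must order the entry
	 * writes before the index update in a real implementation. */
	atomic_thread_fence(memory_order_release);

	u->idx = (uint16_t)(u->idx + s->count);
	s->count = 0;
}
```

With the shared index at 6 and three entries in the shadow array, shadow_flush() fills slots 6 and 7, wraps the third entry to slot 0, and advances the index to 9 in a single store.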