> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Huawei Xie
> Sent: Sunday, October 25, 2015 11:35 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v5 0/7] virtio ring layout optimization and simple
> rx/tx processing
>
> Changes in v5:
> - Call __rte_pktmbuf_prefree_seg to check refcnt when freeing mbufs
>
> Changes in v4:
> - Fix the error in virtio tx ring layout ascii chart in the commit message
> - Move virtio_xmit_cleanup ahead to free descriptors earlier
> - Test merge-able feature when selecting simple rx/tx functions
>
> Changes in v3:
> - Remove unnecessary NULL test for rte_free
> - Remove unnecessary assignment of local var after free
> - Remove return at the end of void function
> - Remove always_inline attribute for virtio_xmit_cleanup
> - Reword some commit messages
> - Add TODO in the commit message of simple tx patch
>
> Changes in v2:
> - Remove the configure macro
> - Enable simple R/TX processing when user specifies simple txq flags
> - Reword some comments and commit messages
>
> In a DPDK-based switching environment, vhost mostly runs on a dedicated core
> while virtio processing in guest VMs runs on different cores.
> Take RX for example. With the generic implementation, for each guest buffer
> the virtio driver:
> a) allocates a descriptor from the free descriptor list
> b) modifies the avail ring entry to point to the allocated descriptor
> c) frees the descriptor after the packet is received
>
> When vhost fetches the avail ring, it needs to fetch the modified L1 cache
> line from the virtio core, which is costly on current CPU implementations.
>
> The idea of this optimization is to allocate a fixed descriptor for each
> entry of the avail ring, so the avail ring always stays the same during the
> run.
> This removes the L1M cache transfer from the virtio core to the vhost core for
> the avail ring.
> (Note that we cannot avoid the cache transfer for descriptors.)
> Besides, descriptor allocation and free operations are eliminated.
> This also makes vector processing possible to further accelerate the
> processing.
>
> This is the layout for the avail ring (taking 256 ring entries as an example),
> with each entry pointing to the descriptor with the same index.
>                     avail
>                     idx
>                     +
>                     |
> +----+----+---+-------------+------+
> | 0  | 1  | 2 | ...   | 254  | 255 |  avail ring
> +-+--+-+--+-+-+---------+---+--+---+
>   |    |    |       |   |      |
>   |    |    |       |   |      |
>   v    v    v       |   v      v
> +-+--+-+--+-+-+---------+---+--+---+
> | 0  | 1  | 2 | ...   | 254  | 255 |  desc ring
> +----+----+---+-------------+------+
>                     |
>                     |
> +----+----+---+-------------+------+
> | 0  | 1  | 2 |       | 254  | 255 |  used ring
> +----+----+---+-------------+------+
>                     |
>                     +
>
> This is the ring layout for TX.
> As we need one virtio header for each xmit packet, we have 128 slots
> available.
>
>                          ++
>                          ||
>                          ||
> +-----+-----+-----+--------------+------+------+------+
> |  0  |  1  | ... |  127 || 128  | 129  | ...  | 255  |   avail ring
> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
>    |     |            |  ||  |      |             |
>    v     v            v  ||  v      v             v
> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
> | 127 | 128 | ... |  255 || 127  | 128  | ...  | 255  |   desc ring for virtio_net_hdr
> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
>    |     |            |  ||  |      |             |
>    v     v            v  ||  v      v             v
> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
> |  0  |  1  | ... |  127 ||  0   |  1   | ...  | 127  |   desc ring for tx data
> +-----+-----+-----+--------------+------+------+------+
>                          ||
>                          ||
>                          ++
>
>
> A performance boost can be observed only if the virtio backend isn't the
> bottleneck, or in the VM2VM case.
> There are also several vhost optimization patches to be submitted later.
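For readers following the two charts above, here is a minimal C sketch of the fixed layouts as the charts appear to describe them. It is not the driver code from the patches: `sketch_ring`, `sketch_desc` and both init functions are hypothetical names, the structures are simplified stand-ins, and the ring size of 256 is the example size from the cover letter. For TX it assumes the virtio_net_hdr descriptors sit in the upper half of the desc ring (128 to 255) and each one chains to a data descriptor in the lower half; the chart labels the first header descriptor 127, so treat the exact offset as an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 256u  /* example size used in the cover letter */

/* Simplified stand-ins for the ring structures (hypothetical names,
 * not the definitions used by the virtio driver). */
struct sketch_desc {
    uint16_t next;      /* index of the chained descriptor */
    uint16_t has_next;  /* 1 if 'next' is valid, 0 otherwise */
};

struct sketch_ring {
    uint16_t avail[RING_SIZE];
    struct sketch_desc desc[RING_SIZE];
};

/* RX: avail entry i always refers to descriptor i, so the avail ring
 * contents never change after this one-time setup. */
static void sketch_rx_layout_init(struct sketch_ring *r)
{
    for (uint16_t i = 0; i < RING_SIZE; i++) {
        r->avail[i] = i;
        r->desc[i].next = 0;
        r->desc[i].has_next = 0;
    }
}

/* TX: each packet needs a header descriptor plus a data descriptor,
 * so only RING_SIZE / 2 slots are usable. Assumption: header
 * descriptors live in the upper half and chain to the data
 * descriptor with the same offset in the lower half. */
static void sketch_tx_layout_init(struct sketch_ring *r)
{
    const uint16_t mid = RING_SIZE / 2;

    assert(RING_SIZE % 2 == 0);
    for (uint16_t i = 0; i < mid; i++) {
        /* Both halves of the avail ring name the same header descriptor. */
        r->avail[i] = (uint16_t)(i + mid);
        r->avail[i + mid] = (uint16_t)(i + mid);
        /* Header descriptor chains to its data descriptor. */
        r->desc[i + mid].next = i;
        r->desc[i + mid].has_next = 1;
        /* Data descriptor ends the chain. */
        r->desc[i].next = 0;
        r->desc[i].has_next = 0;
    }
}
```

Under these assumptions, avail[0] and avail[128] refer to the same header descriptor on TX, which is consistent with the remark that only 128 slots are available.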
>
>
> Huawei Xie (7):
>   virtio: add virtio_rxtx.h header file
>   virtio: add software rx ring, fake_buf into virtqueue
>   virtio: rx/tx ring layout optimization
>   virtio: fill RX avail ring with blank mbufs
>   virtio: virtio vec rx
>   virtio: simple tx routine
>   virtio: pick simple rx/tx func
>
>  drivers/net/virtio/Makefile             |   2 +-
>  drivers/net/virtio/virtio_ethdev.c      |  12 +-
>  drivers/net/virtio/virtio_ethdev.h      |   5 +
>  drivers/net/virtio/virtio_rxtx.c        |  56 ++++-
>  drivers/net/virtio/virtio_rxtx.h        |  39 +++
>  drivers/net/virtio/virtio_rxtx_simple.c | 414 ++++++++++++++++++++++++++++++++
>  drivers/net/virtio/virtqueue.h          |   5 +
>  7 files changed, 529 insertions(+), 4 deletions(-)
>  create mode 100644 drivers/net/virtio/virtio_rxtx.h
>  create mode 100644 drivers/net/virtio/virtio_rxtx_simple.c
>
> --
> 1.8.1.4

Acked-by: Jianfeng Tan <jianfeng.tan at intel.com>

Thanks,
Jianfeng