Buffer vhost enqueue shadow ring updates and flush the shadow ring only
once the number of buffered descriptors exceeds one batch. Thus virtio
can receive packets at a faster frequency.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index e50e137ca..18a207fc6 100644
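The buffering idea described above can be sketched roughly as follows. This is only an illustration, not the patch's code: the names `shadow_ring`, `shadow_buffer`, `shadow_flush` and the batch size of 4 are assumptions, and the flush here fires when the buffered count reaches a full batch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BATCH_SIZE 4 /* assumed batch size, in descriptors */

/* Hypothetical bookkeeping for buffered (shadow) used entries. */
struct shadow_ring {
	uint16_t buffered; /* descriptors buffered but not yet flushed */
	uint32_t flushes;  /* number of flushes performed so far */
};

/* Stand-in for writing the buffered entries back to the ring. */
static void
shadow_flush(struct shadow_ring *s)
{
	s->buffered = 0;
	s->flushes++;
}

/*
 * Buffer 'count' descriptors and flush only once a full batch is
 * pending. Returns true when a flush happened.
 */
static bool
shadow_buffer(struct shadow_ring *s, uint16_t count)
{
	assert(count > 0);
	s->buffered += count;
	if (s->buffered < BATCH_SIZE)
		return false;
	shadow_flush(s);
	return true;
}
```

Deferring the flush this way trades a little latency on the last few descriptors for fewer, larger write-backs to the shared ring.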
Add a batch dequeue function, like the enqueue function, for packed ring.
The batch dequeue function will not support chained descriptors; the
single packet dequeue function will handle them.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index e241436c7..e50e137ca
: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 7bf9ff9b7..f62e9ec3f 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -42,6 +42,8 @@
#define PACKED_RX_USED_FLAG (0ULL | VRING_DESC_F_AVAIL | VRING_DESC_F_USED
Optimize the vhost device Tx datapath by separating functions. Packets
that can be filled into one descriptor will be handled by batch, and
others will be handled one by one as before.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 1b0fa2c64
Flush used flags when batched enqueue function is finished. Descriptor's
flags are pre-calculated as they will be reset by vhost.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 18a207fc6..7bf9ff9b7 100644
--- a/lib/librte_vhost/vhost.h
+++
Vhost enqueue descriptors are updated by batch number, while vhost
dequeue descriptors are buffered. Meanwhile in dequeue function only
first descriptor is buffered. Due to these differences, split vhost
enqueue and dequeue flush functions.
Signed-off-by: Marvin Liu
diff --git a/lib
Optimize vhost zero copy dequeue path like normal dequeue path.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 5f2822ba2..deb9d0e39 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1881,6 +1881,141
Disable software pre-fetch actions on Skylake and later platforms.
Hardware can fetch needed data for vhost, additional software pre-fetch
will impact performance.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 30839a001..5f3b42e56 100644
When the VIRTIO_F_IN_ORDER feature is negotiated, vhost can optimize the
dequeue function by only updating the first used descriptor.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 046e497c2..6f28082bc 100644
--- a/lib/librte_vhost/virtio_net.c
Optimize the vhost device Rx datapath by separating functions.
Non-chained and direct descriptors will be handled by batch, and others
will be handled one by one as before.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index deb9d0e39..56c2080fb
iated
Marvin Liu (13):
vhost: add packed ring indexes increasing function
vhost: add packed ring single enqueue
vhost: try to unroll for each loop
vhost: add packed ring batch enqueue
vhost: add packed ring single dequeue
vhost: add packed ring batch dequeue
vhost: flush enqueue updat
When vhost is doing [en/de]queue, vq's local variable last_[used/avail]_idx
will be increased. Adding inline functions can avoid duplicated code.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 5131a97a3..22a3ddc38 100644
--- a/lib/librte_
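As an illustration of the index-increasing helpers described above, here is a minimal sketch. The struct `vq_sketch` and the helper names are hypothetical simplifications, and it assumes a packed ring where the wrap counter toggles each time the index passes the ring size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the virtqueue state. */
struct vq_sketch {
	uint16_t size;            /* number of descriptors in the ring */
	uint16_t last_used_idx;
	uint16_t last_avail_idx;
	bool used_wrap_counter;
	bool avail_wrap_counter;
};

/* Advance last_used_idx by 'num', toggling the wrap counter on wrap. */
static inline void
vq_inc_last_used(struct vq_sketch *vq, uint16_t num)
{
	assert(num <= vq->size);
	vq->last_used_idx += num;
	if (vq->last_used_idx >= vq->size) {
		vq->used_wrap_counter ^= 1;
		vq->last_used_idx -= vq->size;
	}
}

/* Same logic for the avail side. */
static inline void
vq_inc_last_avail(struct vq_sketch *vq, uint16_t num)
{
	assert(num <= vq->size);
	vq->last_avail_idx += num;
	if (vq->last_avail_idx >= vq->size) {
		vq->avail_wrap_counter ^= 1;
		vq->last_avail_idx -= vq->size;
	}
}
```

Keeping the wrap handling in one place means the enqueue and dequeue paths cannot drift apart in how they toggle the counters.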
Add a vhost enqueue function for a single packet, and meanwhile leave
space for the flush used ring function.
Signed-off-by: Marvin Liu
Reviewed-by: Maxime Coquelin
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 42b662080..142c14e04 100644
--- a/lib/librte_vhost
Create a macro for adding an unroll pragma before each for loop. Batch
functions will consist of several small loops, which can be optimized
by the compilers' loop unrolling pragma.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 8623
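A macro of the kind described above might look like the following sketch. The macro names, the unroll factor of 4 and the compiler-detection logic are assumptions made for illustration; the patch itself may select the pragma differently (for example from the build system).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pick an unroll pragma by compiler; the factor 4 is an assumption. */
#if defined(__INTEL_COMPILER)
#define UNROLL_PRAGMA _Pragma("unroll (4)")
#elif defined(__clang__)
#define UNROLL_PRAGMA _Pragma("unroll 4")
#elif defined(__GNUC__) && (__GNUC__ >= 8)
#define UNROLL_PRAGMA _Pragma("GCC unroll 4")
#else
#define UNROLL_PRAGMA /* no known pragma: plain loop */
#endif

/* Hypothetical helper: a for loop preceded by the unroll hint. */
#define for_each_try_unroll(iter, start, end) UNROLL_PRAGMA \
	for ((iter) = (start); (iter) < (end); (iter)++)

/* Example of a small fixed-size loop that benefits from unrolling. */
static uint32_t
sum_batch(const uint32_t lens[4])
{
	uint32_t total = 0;
	uint16_t i;

	assert(lens != NULL);
	for_each_try_unroll(i, 0, 4)
		total += lens[i];
	return total;
}
```

The pragma is only a hint, so the loop behaves identically on compilers that ignore it.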
The batch enqueue function will first check whether descriptors are
cache aligned. It will also check prerequisites at the beginning. The
batch enqueue function does not support chained mbufs; the single packet
enqueue function will handle them.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost
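The alignment check described above can be sketched as below. It assumes a 64-byte cache line and the 16-byte packed descriptor layout from the virtio specification, so one batch is four descriptors; the function name `batch_possible` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumed cache line size in bytes */

/* Packed ring descriptor layout (16 bytes). */
struct packed_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	uint16_t flags;
};

#define BATCH_SIZE (CACHE_LINE_SIZE / sizeof(struct packed_desc))
#define BATCH_MASK (BATCH_SIZE - 1)

/*
 * Return true when a whole batch can be handled at once:
 * - the avail index starts on a cache-line boundary,
 * - the batch does not run past the end of the ring,
 * - enough packets remain to fill the batch.
 */
static bool
batch_possible(uint16_t avail_idx, uint16_t ring_size, uint16_t pkts_left)
{
	assert(ring_size >= BATCH_SIZE);
	if (avail_idx & BATCH_MASK)
		return false;
	if (avail_idx + BATCH_SIZE > ring_size)
		return false;
	if (pkts_left < BATCH_SIZE)
		return false;
	return true;
}
```

When the check fails, a caller would fall back to the single packet path, which also covers chained mbufs.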
Add a batch dequeue function, like the enqueue function, for packed ring.
The batch dequeue function will not support chained descriptors; the
single packet dequeue function will handle them.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 18d01cb19..96bf763b1
Optimize the vhost device packed ring enqueue function by splitting it
into batch and single functions. Packets that can be filled into one
desc will be handled by batch, and others by single as before.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost
Add a vhost single packet dequeue function for packed ring, and meanwhile
leave space for the shadow used ring update function.
Signed-off-by: Marvin Liu
Reviewed-by: Maxime Coquelin
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index a8130dc06..e1b06c1ce 100644
--- a/lib
Add vhost packed ring zero copy batch and single dequeue functions like
normal dequeue path.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 5cdca9a7f..01d1603e3 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost
Flush used flags when batched enqueue function is finished. Descriptor's
flags are pre-calculated as they will be reset by vhost.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index a60b88d89..bf3c30f43 100644
--- a/lib/librte_vhost/vhost.h
+++
Optimize the vhost device packed ring dequeue function by splitting it
into batch and single functions. Non-chained and direct descriptors will
be handled by batch, and others by single as before.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost
Buffer as many used ring updates as possible in the vhost dequeue
function to coordinate with the virtio driver. To support buffering, a
shadow used ring element should contain the descriptor's flags. The first
shadowed ring index is recorded for calculating the buffered number.
Signed-off-by: Marvin Liu
Buffer the vhost enqueue shadowed ring flush action until the buffered
number exceeds one batch. Thus virtio can receive packets at a faster
frequency.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 96bf763b1..a60b88d89 100644
--- a/lib/librte_vhost/vhost.h
When the VIRTIO_F_IN_ORDER feature is negotiated, vhost can optimize the
dequeue function by only updating the first used descriptor.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 85ccc02da..88632caff 100644
--- a/lib/librte_vhost/virtio_net.c
used ring update when in_order negotiated
Marvin Liu (13):
vhost: add packed ring indexes increasing function
vhost: add packed ring single enqueue
vhost: try to unroll for each loop
vhost: add packed ring batch enqueue
vhost: add packed ring single dequeue
vhost: add packed ring batch de
Add a vhost enqueue function for a single packet, and meanwhile leave
space for the flush used ring function.
Signed-off-by: Marvin Liu
Reviewed-by: Maxime Coquelin
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 42b662080..142c14e04 100644
--- a/lib/librte_vhost
Create a macro for adding an unroll pragma before each for loop. Batch
functions will consist of several small loops, which can be optimized
by the compilers' loop unrolling pragma.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 8623
When vhost is doing [en/de]queue, vq's local variable last_[used/avail]_idx
will be increased. Adding inline functions can avoid duplicated code.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 5131a97a3..22a3ddc38 100644
--- a/lib/librte_
Add a vhost single packet dequeue function for packed ring, and meanwhile
leave space for the shadow used ring update function.
Signed-off-by: Marvin Liu
Reviewed-by: Maxime Coquelin
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index a8130dc06..e1b06c1ce 100644
--- a/lib
The batch enqueue function will first check whether descriptors are
cache aligned. It will also check prerequisites at the beginning. The
batch enqueue function does not support chained mbufs; the single packet
enqueue function will handle them.
Signed-off-by: Marvin Liu
Reviewed-by: Maxime Coquelin
diff
Add a batch dequeue function, like the enqueue function, for packed ring.
The batch dequeue function will not support chained descriptors; the
single packet dequeue function will handle them.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 18d01cb19..96bf763b1
Buffer the vhost enqueue shadowed ring flush action until the buffered
number exceeds one batch. Thus virtio can receive packets at a faster
frequency.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 96bf763b1..a60b88d89 100644
--- a/lib/librte_vhost/vhost.h
Buffer as many used ring updates as possible in the vhost dequeue
function to coordinate with the virtio driver. To support buffering, a
shadow used ring element should contain the descriptor's flags. The first
shadowed ring index is recorded for calculating the buffered number.
Signed-off-by: Marvin Liu
Flush used elements when batched enqueue function is finished.
Descriptor's flags are pre-calculated as they will be reset by vhost.
Signed-off-by: Marvin Liu
Reviewed-by: Gavin Hu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index a60b88d89..bf3c30f43 100644
---
Add vhost packed ring zero copy batch and single dequeue functions like
normal dequeue path.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 5cdca9a7f..01d1603e3 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost
Optimize the vhost device packed ring enqueue function by splitting it
into batch and single functions. Packets that can be filled into one
desc will be handled by batch, and others by single as before.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost
Optimize the vhost device packed ring dequeue function by splitting it
into batch and single functions. Non-chained and direct descriptors will
be handled by batch, and others by single as before.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost
When the VIRTIO_F_IN_ORDER feature is negotiated, vhost can optimize the
dequeue function by only updating the first used descriptor.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 7c5b4..93ebdd7b6 100644
--- a/lib/librte_vhost/virtio_net.c
Check whether there are enough freed descriptors before the enqueue
operation. If more space is needed, try to clean up the used ring on
demand. This gives more chances to clean up the used ring, and thus
helps RFC2544 performance.
Signed-off-by: Marvin Liu
---
drivers/net/virtio/virtio_rxtx.c | 73
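The on-demand cleanup described above could be sketched like this. The struct `txq_sketch` and both helpers are hypothetical; a real driver would walk the used ring and free mbufs where this sketch only adjusts counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical Tx queue accounting. */
struct txq_sketch {
	uint16_t free_cnt; /* descriptors free for new packets */
	uint16_t used_cnt; /* descriptors completed but not yet reclaimed */
};

/* Reclaim up to 'want' completed descriptors; returns how many. */
static uint16_t
txq_cleanup(struct txq_sketch *q, uint16_t want)
{
	uint16_t n = want < q->used_cnt ? want : q->used_cnt;

	q->used_cnt -= n;
	q->free_cnt += n;
	return n;
}

/* Clean up only when needed; report whether 'need' slots are free. */
static bool
txq_ensure_room(struct txq_sketch *q, uint16_t need)
{
	assert(need > 0);
	if (q->free_cnt < need)
		txq_cleanup(q, need - q->free_cnt);
	return q->free_cnt >= need;
}
```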
When doing xmit in-order enqueue, packets are buffered and then flushed
into the avail ring. It is possible that there is no free room in the
avail ring, so some buffered packets can't be transmitted. So move the
stats update to just after successful avail ring updates.
Signed-off-by: Marvin Liu
---
driver
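To illustrate the ordering issue described above, the following sketch counts statistics only for packets that actually fit into the ring. The names `tx_stats` and `flush_inorder` are hypothetical, and the ring write itself is reduced to a comment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-queue Tx counters. */
struct tx_stats {
	uint64_t packets;
	uint64_t bytes;
};

/*
 * Flush buffered packets into the avail ring. Only packets that fit
 * into the free slots are counted; the rest stay uncounted because
 * they were never handed to the device.
 */
static uint16_t
flush_inorder(const uint32_t *pkt_len, uint16_t nb_buffered,
	      uint16_t free_slots, struct tx_stats *stats)
{
	uint16_t nb_sent = nb_buffered < free_slots ? nb_buffered : free_slots;
	uint16_t i;

	assert(nb_buffered == 0 || pkt_len != 0);
	for (i = 0; i < nb_sent; i++) {
		/* The descriptor would be written to the avail ring here. */
		stats->packets++;
		stats->bytes += pkt_len[i];
	}
	return nb_sent;
}
```

Updating the counters at buffering time instead would over-report whenever the ring runs out of room.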
Add a vhost enqueue function for a single packet, and meanwhile leave
space for the flush used ring function.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 5b85b832d..5ad0a8175 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost
Add a vhost single packet dequeue function for packed ring, and meanwhile
leave space for the shadow used ring update function.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 51ed20543..454e8b33e 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b
Add a burst dequeue function, like the enqueue function, for packed ring.
The burst dequeue function will not support chained descriptors; the
single packet dequeue function will handle them.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index ed8b4aabf..b33f29ba0
The burst enqueue function will first check whether descriptors are
cache aligned. It will also check prerequisites at the beginning. The
burst enqueue function does not support chained mbufs; the single packet
enqueue function will handle them.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b
Simplify the flush shadow used ring function names, as all shadow rings
are reflected to used rings. No need to emphasize the ring type.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index f34df3733..7116c389d 100644
--- a/lib/librte_vhost
.
Disable software prefetch if hardware can do better.
After all these methods are done, single core vhost PvP performance with
64B packets on Xeon 8180 can be boosted by 40%.
Marvin Liu (14):
vhost: add single packet enqueue function
vhost: add burst enqueue function for packed ring
vhost: add single
Vhost enqueue descriptors are updated by burst number, while vhost
dequeue descriptors are buffered. Meanwhile in dequeue function only
first descriptor is buffered. Due to these differences, split vhost
enqueue and dequeue flush functions.
Signed-off-by: Marvin Liu
diff --git a/lib
: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 5471acaf7..b161082ca 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -42,6 +42,8 @@
#define VIRTIO_RX_USED_FLAG (0ULL | VRING_DESC_F_AVAIL | VRING_DESC_F_USED
Flush used flags when the burst enqueue function is finished. Descriptor
flags are pre-calculated as they will be reset by vhost.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 86552cbeb..5471acaf7 100644
--- a/lib/librte_vhost/vhost.h
+++
Buffer vhost enqueue shadow ring updates and flush the shadow ring only
once the number of buffered descriptors exceeds one burst. Thus virtio
can receive packets at a faster frequency.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index b33f29ba0..86552cbeb 100644
Cache the address translation result and use it in the next translation.
Since only a limited number of regions is supported, buffers are most
likely in the same region when doing data transmission.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
index 7fb172912
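The caching idea described above can be illustrated with the sketch below. The types `mem_region` and `mem_table` and the function `gpa_to_hva` are hypothetical simplifications, not the library's API; the sketch simply remembers which region served the last lookup and tries it first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest memory region: guest range mapped at host_addr. */
struct mem_region {
	uint64_t guest_addr;
	uint64_t size;
	uint64_t host_addr;
};

struct mem_table {
	const struct mem_region *regions;
	uint32_t nregions;
	uint32_t cached; /* index of the region that served the last hit */
};

/* True when [gpa, gpa + len) lies entirely inside region r. */
static int
region_contains(const struct mem_region *r, uint64_t gpa, uint64_t len)
{
	return gpa >= r->guest_addr && len <= r->size &&
	       gpa - r->guest_addr <= r->size - len;
}

/* Translate a guest address; returns 0 when no region matches. */
static uint64_t
gpa_to_hva(struct mem_table *t, uint64_t gpa, uint64_t len)
{
	const struct mem_region *r;
	uint32_t i;

	assert(t != NULL);
	/* Fast path: try the region that matched last time. */
	if (t->cached < t->nregions) {
		r = &t->regions[t->cached];
		if (region_contains(r, gpa, len))
			return r->host_addr + (gpa - r->guest_addr);
	}
	/* Slow path: scan all regions and remember the hit. */
	for (i = 0; i < t->nregions; i++) {
		r = &t->regions[i];
		if (region_contains(r, gpa, len)) {
			t->cached = i;
			return r->host_addr + (gpa - r->guest_addr);
		}
	}
	return 0;
}
```

With only a handful of regions the scan is already short, so the gain comes mainly from skipping it entirely on consecutive buffers from the same region.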
Optimize the vhost device rx function by separating descriptors:
non-chained and direct descriptors will be handled by burst, and others
will be handled one by one as before. Pre-fetch descriptors in the next
two cache lines, as hardware will load two cache lines of data
automatically.
Signed-off-by: Marvin Liu
Optimize vhost zero copy dequeue path like normal dequeue path.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 269ec8a43..8032229a0 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1979,6 +1979,108
Disable software pre-fetch actions on Skylake and Cascadelake platforms.
Hardware can fetch needed data for vhost, additional software pre-fetch
will have impact on performance.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 8623e91c0
Optimize vhost device tx function like rx function.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 8032229a0..554617292 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -302,17 +302,6
Check whether there is enough space before the burst enqueue operation.
If more space is needed, try to clean up used descriptors on demand.
This gives more chances to free used descriptors, and thus will help
RFC2544 performance.
Signed-off-by: Marvin Liu
---
drivers/net/virtio/virtio_rxtx.c
When doing xmit in-order enqueue, packets are buffered and then flushed
into avail ring. Buffered packets can be dropped due to insufficient
space. Moving stats update action just after successful avail ring
updates can guarantee correctness.
Signed-off-by: Marvin Liu
---
drivers/net/virtio
er Rx and Tx")
Cc: sta...@dpdk.org
Signed-off-by: Marvin Liu
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 27ead19fb..91df5b1d0 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -106,6 +106,48 @@ vq_ring_free
: e5f456a98d3c ("net/virtio: support in-order Rx and Tx")
Cc: sta...@dpdk.org
Signed-off-by: Marvin Liu
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 91df5b1d0..7d5c60532 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/vir
Add a vhost enqueue function for a single packet, and meanwhile leave
space for the flush used ring function.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 5b85b832d..2b5c47145 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost
NG_SZ - PKT_BURST)
- Optimize dequeue used ring update when in_order negotiated
Marvin Liu (16):
vhost: add single packet enqueue function
vhost: unify unroll pragma parameter
vhost: add burst enqueue function for packed ring
vhost: add single packet dequeue function
vhost: add burst de
Add a macro for unifying the Clang/ICC/GCC unroll pragma format. Burst
functions consist of several small loops, which are optimized by the
compiler's loop unrolling pragma.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 8623e91c0..30839a001 100644
Add a vhost single packet dequeue function for packed ring, and meanwhile
leave space for the shadow used ring update function.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index c664b27c5..047fa7dc8 100644
--- a/lib/librte_vhost/virtio_net.c
Simplify the flush shadow used ring function names, as all shadow rings
are reflected to used rings. No need to emphasize the ring type.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 23c0f4685..ebd6c175d 100644
--- a/lib/librte_vhost
The burst enqueue function will first check whether descriptors are
cache aligned. It will also check prerequisites at the beginning. The
burst enqueue function does not support chained mbufs; the single packet
enqueue function will handle them.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b
Add a burst dequeue function, like the enqueue function, for packed ring.
The burst dequeue function will not support chained descriptors; the
single packet dequeue function will handle them.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 67889c80a..9fa3c8adf
: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 9c42c7db0..14e87f670 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -42,6 +42,8 @@
#define VIRTIO_RX_USED_FLAG (0ULL | VRING_DESC_F_AVAIL | VRING_DESC_F_USED
Buffer vhost enqueue shadow ring updates and flush the shadow ring only
once the number of buffered descriptors exceeds one burst. Thus virtio
can receive packets at a faster frequency.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 9fa3c8adf..000648dd4 100644
Flush used flags when the burst enqueue function is finished. Descriptor
flags are pre-calculated as they will be reset by vhost.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 000648dd4..9c42c7db0 100644
--- a/lib/librte_vhost/vhost.h
+++
Vhost enqueue descriptors are updated by burst number, while vhost
dequeue descriptors are buffered. Meanwhile in dequeue function only
first descriptor is buffered. Due to these differences, split vhost
enqueue and dequeue flush functions.
Signed-off-by: Marvin Liu
diff --git a/lib
Optimize the vhost device Rx datapath by separating functions.
Non-chained and direct descriptors will be handled by burst, and others
will be handled one by one as before.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index a8df74f87..066514e43
Optimize the vhost device Tx datapath by separating functions. Packets
that can be filled into one descriptor will be handled by burst, and
others will be handled one by one as before. Pre-fetch descriptors in
the next two cache lines, as hardware will load two cache lines of data
automatically.
Signed-off-by: Marvin
When the VIRTIO_F_IN_ORDER feature is negotiated, vhost can optimize the
dequeue function by only updating the first used descriptor.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 357517cdd..a7bb4ec79 100644
--- a/lib/librte_vhost/virtio_net.c
Cache the address translation result and use it in the next translation.
Since only a limited number of regions is supported, buffers are most
likely in the same region when doing data transmission.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
index 7fb172912
Optimize vhost zero copy dequeue path like normal dequeue path.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 2418b4e45..a8df74f87 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1909,6 +1909,144
Disable software pre-fetch actions on Skylake and Cascadelake platforms.
Hardware can fetch needed data for vhost, additional software pre-fetch
will have impact on performance.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 30839a001
et/virtio: optimize Tx enqueue for packed ring")
Fixes: e5f456a98d3c ("net/virtio: support in-order Rx and Tx")
Cc: sta...@dpdk.org
Reported-by: Stephen Hemminger
Signed-off-by: Marvin Liu
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 27ead19
when in_order negotiated
Marvin Liu (15):
vhost: add single packet enqueue function
vhost: unify unroll pragma parameter
vhost: add batch enqueue function for packed ring
vhost: add single packet dequeue function
vhost: add batch dequeue function
vhost: flush vhost enqueue shadow ring by
Add a vhost enqueue function for a single packet, and meanwhile leave
space for the flush used ring function.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 5b85b832d..520c4c6a8 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost
Add a macro for unifying the Clang/ICC/GCC unroll pragma format. Batch
functions consist of several small loops, which are optimized by the
compiler's loop unrolling pragma.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 8623e91c0..30839a001 100644
Add a vhost single packet dequeue function for packed ring, and meanwhile
leave space for the shadow used ring update function.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 5e08f7d9b..17aabe8eb 100644
--- a/lib/librte_vhost/virtio_net.c
The batch enqueue function will first check whether descriptors are
cache aligned. It will also check prerequisites at the beginning. The
batch enqueue function does not support chained mbufs; the single packet
enqueue function will handle them.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b
Add a batch dequeue function, like the enqueue function, for packed ring.
The batch dequeue function will not support chained descriptors; the
single packet dequeue function will handle them.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index e241436c7..e50e137ca
Flush used flags when batched enqueue function is finished. Descriptor's
flags are pre-calculated as they will be reset by vhost.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 18a207fc6..7bf9ff9b7 100644
--- a/lib/librte_vhost/vhost.h
+++
Buffer vhost enqueue shadow ring updates and flush the shadow ring only
once the number of buffered descriptors exceeds one batch. Thus virtio
can receive packets at a faster frequency.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index e50e137ca..18a207fc6 100644
: Marvin Liu
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 7bf9ff9b7..f62e9ec3f 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -42,6 +42,8 @@
#define PACKED_RX_USED_FLAG (0ULL | VRING_DESC_F_AVAIL | VRING_DESC_F_USED
Vhost enqueue descriptors are updated by batch number, while vhost
dequeue descriptors are buffered. Meanwhile in dequeue function only
first descriptor is buffered. Due to these differences, split vhost
enqueue and dequeue flush functions.
Signed-off-by: Marvin Liu
diff --git a/lib
Optimize the vhost device Tx datapath by separating functions. Packets
that can be filled into one descriptor will be handled by batch, and
others will be handled one by one as before.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 1b0fa2c64
Optimize the vhost device Rx datapath by separating functions.
Non-chained and direct descriptors will be handled by batch, and others
will be handled one by one as before.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 9ab95763a..20624efdc
Cache the address translation result and use it in the next translation.
Since only a limited number of regions is supported, buffers are most
likely in the same region when doing data transmission.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
index 7fb172912
Optimize vhost zero copy dequeue path like normal dequeue path.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index c485e7f49..9ab95763a 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1881,6 +1881,141
When the VIRTIO_F_IN_ORDER feature is negotiated, vhost can optimize the
dequeue function by only updating the first used descriptor.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index e3872e384..1e113fb3a 100644
--- a/lib/librte_vhost/virtio_net.c
Disable software pre-fetch actions on Skylake and later platforms.
Hardware can fetch needed data for vhost, additional software pre-fetch
will impact performance.
Signed-off-by: Marvin Liu
diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 30839a001..5f3b42e56 100644
A data prefetch instruction can preload data into the CPU's hierarchical
cache before data access. The virtio datapath utilizes this feature to
accelerate data access. As the config option RTE_PMD_PACKET_PREFETCH was
discarded, packet data prefetch is now enabled based on architecture.
Signed-off-by: Marvin Liu
mbuf segments number for calculating correct desc length.
Fixes: de8b3d238074 ("net/virtio: fix indirect descs in packed datapaths")
Cc: sta...@dpdk.org
Signed-off-by: Marvin Liu
diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index 8c8ab9889..42c4c9882 10
A data prefetch instruction can preload data into the CPU's hierarchical
cache before data access. Virtualized data paths like virtio utilize
this feature for acceleration. Since most modern CPUs support the
prefetch function, we can enable packet data prefetch by default.
Signed-off-by: Marvin Liu
Like split ring, packed ring will utilize indirect ring elements when
queued mbufs need multiple descriptors. Thus each packet will take only
one slot even when it has multiple segments.
Signed-off-by: Marvin Liu
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index
Add packed indirect descriptors format into virtio Tx region. When
initializing vring, packed indirect descriptors will be initialized if
ring type is packed.
Signed-off-by: Marvin Liu
diff --git a/drivers/net/virtio/virtio_ethdev.c
b/drivers/net/virtio/virtio_ethdev.c
index 013a2904e
requirements. Otherwise
will fallback to original data path.
Signed-off-by: Marvin Liu
diff --git a/doc/guides/nics/vhost.rst b/doc/guides/nics/vhost.rst
index d36f3120b..efdaf4de0 100644
--- a/doc/guides/nics/vhost.rst
+++ b/doc/guides/nics/vhost.rst
@@ -64,6 +64,11 @@ The user can specify
* dynamically allocate memory regions structure
* remove unlikely hint for in_order
v2:
* add vIOMMU support
* add dequeue offloading
* rebase code
Marvin Liu (5):
vhost: add vectorized data path
vhost: reuse packed ring functions
vhost: prepare memory regions addresses
vhost: add packed ring