VPP performs best when packets are batched together. You should observe a performance improvement with an iperf multi-flow test if there are no other bottlenecks in your setup.
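For example, something like the following run from M1 (a sketch only; 10.250.1.100 is the server address seen in the tcpdump output further down this thread, and the flow count is arbitrary):

iperf3 -c 10.250.1.100 -P 4 -t 30    # -P 4 opens four parallel TCP flows instead of one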
Best Regards,
Mohsin

From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> on behalf of Xiaodong Xu <stid.s...@gmail.com>
Date: Thursday, October 20, 2022 at 4:30 AM
To: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io>
Subject: Re: [vpp-dev] Throughput of VPP on KVM is Significantly Worse than Linux Kernel

I tried the native virtio driver with GSO enabled, and the throughput (without any tuning) is close to 16 Gbps. That is a big improvement, though there is still some gap between VPP and the Linux kernel. Does this suggest that the virtio driver in DPDK has some issues?

Regards,
Xiaodong

On Wed, Oct 12, 2022 at 3:40 AM Mohsin Kazmi via lists.fd.io <sykazmi=cisco....@lists.fd.io> wrote:

Hi,

You can use the VPP native virtio driver in the VM:

./dpdk-devbind.py -b vfio-pci 00:03.0 00:04.0
cd /home/vpp
set logging class pci level debug
set logging class virtio level debug
create int virtio 0000:00:03.0 gso-enabled
create int virtio 0000:00:04.0 gso-enabled

The command documented at https://s3-docs.fd.io/vpp/22.10/cli-reference/clis/clicmd_src_vnet_gso.html enables software segmentation of packets when the interface doesn't support the offload, but you should enable GSO on the interfaces rather than chunking packets in software. Please pin the VM threads and the iperf threads to get representative performance results.

-Best Regards,
Mohsin Kazmi

From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> on behalf of Xiaodong Xu <stid.s...@gmail.com>
Date: Thursday, October 6, 2022 at 10:24 PM
To: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io>
Subject: Re: [vpp-dev] Throughput of VPP on KVM is Significantly Worse than Linux Kernel

Hi Wentian,

Please take a look at https://s3-docs.fd.io/vpp/22.10/cli-reference/clis/clicmd_src_vnet_gso.html for the GSO feature in VPP. I tried to turn on the TSO option too, but VPP doesn't seem to be able to forward traffic once TSO is turned on.

The test case for the vhost-user driver that I ran is similar to the topology shown at https://wiki.fd.io/view/VPP/Use_VPP_to_Chain_VMs_Using_Vhost-User_Interface#Two_Chained_QEMU_Instances_with_VPP_Vhost-User_Interfaces. The differences are that I don't have the 'VPP l2 xc1' / 'VPP l2 xc3' instances from the diagram, and the 'testpmd' application is replaced with VPP in my testing. You do need a VPP instance running on the host machine, though, in order to bridge the two VMs together.

Xiaodong

On Thu, Oct 6, 2022 at 7:40 AM Bu Wentian <buwent...@outlook.com> wrote:

Hi Xiaodong,

Could you please tell me how to enable GSO in VPP? I read startup.conf and searched the VPP documents but didn't find any option for GSO. I did find a TSO option in the default startup.conf; according to the comment, the "tso on" option must be enabled together with "enable-tcp-udp-checksum". However, when I tried to do so, VPP failed to start after enabling the two options.
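For reference, those two options go in the dpdk { } stanza of startup.conf, roughly like this (a sketch only; the dev entries are placeholders, and as noted above this combination made VPP fail to start in my setup):

dpdk {
  dev 0000:00:03.0
  dev 0000:00:04.0
  # the comment in the default startup.conf says this is required before tso
  enable-tcp-udp-checksum
  tso on
}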
I also read the document you mentioned (https://www.redhat.com/en/blog/hands-vhost-user-warm-welcome-dpdk), and I found a similar example in the VPP documentation: https://s3-docs.fd.io/vpp/22.06/usecases/vhost/index.html. Both examples seem to use VPP as a bridge on the host and run the test inside VMs. In my scenario, I need to run VPP in the guest rather than on the host. Can vhost-user work in this scenario?

Sincerely,
Wentian

________________________________
From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> on behalf of Xiaodong Xu <stid.s...@gmail.com>
Sent: October 6, 2022 0:40
To: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io>
Subject: Re: [vpp-dev] Throughput of VPP on KVM is Significantly Worse than Linux Kernel

I actually tried to enable GSO on both the input and output interfaces in VPP, and it made only a small difference in the results. In most cases the throughput stays the same, but in some cases I see a burst of 4.5 Gbps (3x the original rate). As Benoit said, even without GSO the throughput is way too low for VPP.

I also see that KVM has 'gso' and 'tso4' options for either host or guest (https://libvirt.org/formatdomain.html#setting-nic-driver-specific-options), but they are all turned on by default.
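For reference, those knobs live on the interface's <driver> element in the libvirt domain XML; a minimal sketch per the formatdomain page linked above (the attribute values and queue count here are only examples, not a recommendation):

<interface type='network'>
  <source network='default'/>
  <model type='virtio'/>
  <driver name='vhost' queues='4'>
    <!-- offloads on the host side of the virtio backend -->
    <host csum='on' gso='on' tso4='on' tso6='on'/>
    <!-- offloads offered to the guest driver -->
    <guest csum='on' tso4='on' tso6='on'/>
  </driver>
</interface>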
In the meantime, I wonder whether there is a GRO option (or maybe jumbo frames?) for VPP, since I suppose the GSO option applies to sending rather than receiving.

Regards,
Xiaodong

On Wed, Oct 5, 2022 at 8:55 AM Bu Wentian <buwent...@outlook.com> wrote:

Hi Benoit,

I checked the packet sizes with tcpdump, and the result matches what you said. I ran the iperf server on M2 and the iperf client on M1, with tcpdump on both. On M1 (the sender), most packet lengths in tcpdump were 65160, irrespective of which dataplane the router used. On M2 (the receiver), when using the Linux kernel as the router most packet lengths were 65160; when using VPP as the router, most packet lengths varied from 1448 to 5792. I have pasted some tcpdump output below.

I have tried some tuning tricks on the VPP VM, such as CPU pinning, but didn't get a significant improvement in throughput. Xiaodong also said that the bottleneck is not the CPU. I am not sure whether it is GSO that causes the performance difference between Linux routing and VPP routing.

I still have a question: Linux routing and VPP routing both work at Layer 3, which means they don't care what the payload of an IP packet is; they just forward it. The IP packets come from M1, so there should be no difference between the two experiments. Why, then, can GSO on the router cause this performance difference? And what can I do to improve VPP performance if GSO is the reason?

Sincerely,
Wentian

Part of the tcpdump output on M1:

15:28:24.563697 IP nt1m1.33998 > 10.250.1.100.5201: Flags [P.], seq 2657974901:2657985037, ack 1, win 32, options [nop,nop,TS val 3264786600 ecr 615633982], length 10136
15:28:24.564599 IP nt1m1.33998 > 10.250.1.100.5201: Flags [P.], seq 2657985037:2658040061, ack 1, win 32, options [nop,nop,TS val 3264786601 ecr 615633982], length 55024
15:28:24.564709 IP nt1m1.33998 > 10.250.1.100.5201: Flags [P.], seq 2658040061:2658105221, ack 1, win 32, options [nop,nop,TS val 3264786601 ecr 615633982], length 65160
15:28:24.564778 IP nt1m1.33998 > 10.250.1.100.5201: Flags [P.], seq 2658105221:2658170381, ack 1, win 32, options [nop,nop,TS val 3264786601 ecr 615633983], length 65160
15:28:24.564833 IP nt1m1.33998 > 10.250.1.100.5201: Flags [P.], seq 2658170381:2658235541, ack 1, win 32, options [nop,nop,TS val 3264786601 ecr 615633983], length 65160
15:28:24.564900 IP nt1m1.33998 > 10.250.1.100.5201: Flags [P.], seq 2658235541:2658300701, ack 1, win 32, options [nop,nop,TS val 3264786601 ecr 615633983], length 65160
15:28:24.564956 IP nt1m1.33998 > 10.250.1.100.5201: Flags [P.], seq 2658300701:2658329661, ack 1, win 32, options [nop,nop,TS val 3264786602 ecr 615633983], length 28960

Part of the tcpdump output on M2 (when using VPP routing):

15:33:41.573257 IP 10.250.0.100.43298 > nt1m2.5201: Flags [.], seq 2699471685:2699476029, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 4344
15:33:41.573262 IP 10.250.0.100.43298 > nt1m2.5201: Flags [.], seq 2699476029:2699477477, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 1448
15:33:41.573268 IP 10.250.0.100.43298 > nt1m2.5201: Flags [.], seq 2699477477:2699478925, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 1448
15:33:41.573271 IP 10.250.0.100.43298 > nt1m2.5201: Flags [.], seq 2699478925:2699480373, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 1448
15:33:41.573279 IP 10.250.0.100.43298 > nt1m2.5201: Flags [.], seq 2699480373:2699484717, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 4344
15:33:41.573287 IP 10.250.0.100.43298 > nt1m2.5201: Flags [.], seq 2699484717:2699487613, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 2896
15:33:41.573289 IP 10.250.0.100.43298 > nt1m2.5201: Flags [.], seq 2699487613:2699489061, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 1448
15:33:41.573296 IP 10.250.0.100.43298 > nt1m2.5201: Flags [.], seq 2699489061:2699491957, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 2896
15:33:41.573303 IP 10.250.0.100.43298 > nt1m2.5201: Flags [.], seq 2699491957:2699496301, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 4344
15:33:41.573311 IP 10.250.0.100.43298 > nt1m2.5201: Flags [.], seq 2699496301:2699499197, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 2896
15:33:41.573312 IP 10.250.0.100.43298 > nt1m2.5201: Flags [.], seq 2699499197:2699500645, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 1448
15:33:41.573323 IP 10.250.0.100.43298 > nt1m2.5201: Flags [.], seq 2699500645:2699503541, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 2896
15:33:41.573328 IP 10.250.0.100.43298 > nt1m2.5201: Flags [P.], seq 2699503541:2699504989, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 1448
15:33:41.573328 IP 10.250.0.100.43298 > nt1m2.5201: Flags [.], seq 2699504989:2699507885, ack 1, win 32, options [nop,nop,TS val 3265103413 ecr 615950794], length 2896

Part of the tcpdump output on M2 (when using Linux routing):

15:54:06.222861 IP 10.250.0.100.47000 > nt1m2.5201: Flags [P.], seq 93174477:93239637, ack 1, win 32, options [nop,nop,TS val 3266328062 ecr 617175444], length 65160
15:54:06.222861 IP 10.250.0.100.47000 > nt1m2.5201: Flags [P.], seq 93239637:93304797, ack 1, win 32, options [nop,nop,TS val 3266328062 ecr 617175444], length 65160
15:54:06.222861 IP 10.250.0.100.47000 > nt1m2.5201: Flags [P.], seq 93304797:93369957, ack 1, win 32, options [nop,nop,TS val 3266328062 ecr 617175444], length 65160
15:54:06.222900 IP 10.250.0.100.47000 > nt1m2.5201: Flags [P.], seq 93369957:93435117, ack 1, win 32, options [nop,nop,TS val 3266328062 ecr 617175444], length 65160
15:54:06.222924 IP 10.250.0.100.47000 > nt1m2.5201: Flags [P.], seq 93435117:93500277, ack 1, win 32, options [nop,nop,TS val 3266328062 ecr 617175444], length 65160
15:54:06.223035 IP 10.250.0.100.47000 > nt1m2.5201: Flags [P.], seq 93500277:93565437, ack 1, win 32, options [nop,nop,TS val 3266328062 ecr 617175444], length 65160
15:54:06.223035 IP 10.250.0.100.47000 > nt1m2.5201: Flags [P.], seq 93565437:93630597, ack 1, win 32, options [nop,nop,TS val 3266328062 ecr 617175444], length 65160

________________________________
From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> on behalf of Benoit Ganne (bganne) via lists.fd.io <bganne=cisco....@lists.fd.io>
Sent: October 3, 2022 16:25
To: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io>
Subject: Re: [vpp-dev] Throughput of VPP on KVM is Significantly Worse than Linux Kernel

Hi Wentian, Xiaodong,

When testing VM-to-VM iperf (i.e. TCP) throughput as Wentian does, the most important factor is whether GSO is turned on: when using Linux as a router it is on by default, whereas when using VPP as a router it is not. With GSO, Linux will use 64 KB TCP packets, whereas without GSO VPP will use MTU-sized packets (probably 1500 bytes). You can check by using tcpdump on both sides of the iperf and comparing packet sizes: with Linux routing I'd expect to see GSO packets (tens of kilobytes), and non-GSO packets (1500 bytes) with VPP.

That said, the performance you get with VPP is still really low, even without GSO. Getting high performance in VM environments is tricky: you must take special care with the QEMU configuration (virtio queue depth, IO mode, etc.) and with CPU pinning (vCPU threads, vhost threads, VPP workers, etc.).
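As an illustration only (the domain name, core numbers and worker list below are placeholders, not a recommendation), pinning usually combines something like the following on the host and in the startup.conf inside the router VM:

# host: pin the router VM's vCPUs and the QEMU emulator threads to dedicated cores
virsh vcpupin r1 0 2
virsh vcpupin r1 1 3      # repeat for each vCPU of the VM
virsh emulatorpin r1 4

# guest (startup.conf): give VPP a dedicated main core and worker cores
cpu {
  main-core 1
  corelist-workers 2-3
}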
If your use case is to use this setup as a traditional router connected through physical interfaces and forwarding packets over a network, and this is only a test setup, I'd recommend changing the setup to match reality; it will be much simpler to set up and optimize correctly.

Best
ben

-----Original Message-----
From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Xiaodong Xu
Sent: Monday, October 3, 2022 1:18
To: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Throughput of VPP on KVM is Significantly Worse than Linux Kernel

Hi Wentian,

I ran a perf test with a topology similar to your setup and got the same result. The only difference is that the iperf server is running on the host rather than in another VM. The throughput is close to 20 Gbps with the Linux kernel data plane but only 1.5 Gbps with the VPP data plane. I think we might have run into the same issue as https://lists.fd.io/g/vpp-dev/message/9571.

Before that, I tried TRex and Pktgen-DPDK, and the results were different: usually the throughput would be a bit higher with the VPP data plane than with the Linux kernel data plane, but not by much. When I checked the CPU usage with the VPP data plane (I changed the rx-mode from polling to interrupt), it was pretty low (< 10%), so it sounds to me like the bottleneck is not the CPU.
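(For anyone reproducing this: the polling/interrupt switch is made per interface from the VPP CLI, e.g. with the GE1 interface name that appears further down this thread.)

vpp# set interface rx-mode GE1 interrupt
vpp# show interface rx-placement          (confirms the mode configured per queue)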
I also ran a similar test with the vhost-user driver, following https://www.redhat.com/en/blog/hands-vhost-user-warm-welcome-dpdk, and got much better throughput (close to 20 Gbps with Pktgen-DPDK), thanks to the shared-memory access that vhost-user provides: there are far fewer memory copies than in the case above where vhost-net is used.

I haven't had a chance to run any tests in a cloud environment, but I still believe running in a KVM setup is an important use case for VPP. If anyone in the VPP community can shed some light on this issue, I believe both Wentian and I would appreciate it very much.

Xiaodong

On Sun, Oct 2, 2022 at 8:10 AM Bu Wentian <buwent...@outlook.com> wrote:

Hi Xiaodong,

Thank you for your reply!

I'm using exactly the VPP 22.06 installed through apt from the FD.io repo. The linux-cp and linux-nl plugins also come with the VPP from that repo.

The virtual NICs on my VMs use virtio (assigned by "model=virtio" when installing with virt-install). The VMs are connected through libvirt networks (auto-created bridges). In my experiments I can ping M2 from M1, and the neighbor table and routing table in VPP seem to be correct.

I'm not sure which driver VPP is using (maybe vfio-pci?). The packet counters look like this:

vpp# show int GE1
              Name               Idx    State  MTU (L3/IP4/IP6/MPLS)     Counter          Count
GE1                               1      up          9000/0/0/0     rx packets               3719027
                                                                    rx bytes              5630430079
                                                                    tx packets               1107500
                                                                    tx bytes                73176221
                                                                    drops                         76
                                                                    ip4                      3718961
                                                                    ip6                           61
                                                                    tx-error                       1
vpp# show int GE2
              Name               Idx    State  MTU (L3/IP4/IP6/MPLS)     Counter          Count
GE2                               2      up          9000/0/0/0     rx packets               1107520
                                                                    rx bytes                73177597
                                                                    tx packets               3718998
                                                                    tx bytes              5630427889
                                                                    drops                         63
                                                                    ip4                      1107455
                                                                    ip6                           62
                                                                    tx-error                    1162

Could you give me more information about how I can get details of the error packets? The main problem is that VPP forwarding performance is much worse than the Linux kernel's (2 Gbps vs 26 Gbps). Is there any way to improve it, or what can I do to find the reason?

Sincerely,
Wentian

________________________________
From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> on behalf of Xiaodong Xu <stid.s...@gmail.com>
Sent: October 2, 2022 1:05
To: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io>
Subject: Re: [vpp-dev] Throughput of VPP on KVM is Significantly Worse than Linux Kernel

Which VPP version are you using in your testing? As of VPP 22.06, the linux-cp and linux-nl plugins are supported and binary builds are available from the FD.io repository (https://s3-docs.fd.io/vpp/22.10/gettingstarted/installing/ubuntu.html).

Can you install VPP from the FD.io repo and try again?
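On Ubuntu the install flow on that page is roughly the following (a sketch; check the linked page for the authoritative steps and package list):

curl -s https://packagecloud.io/install/repositories/fdio/release/script.deb.sh | sudo bash
sudo apt-get update
sudo apt-get install vpp vpp-plugin-core vpp-plugin-dpdk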
(BTW, you might want to disable the ping plugin if linux-cp is used.) I would also suggest you add static routes to rule out any issue with FRR (in which case you don't actually need the linux-cp plugin).

In the meanwhile, I wonder what uio driver you are using on your VPP machine (igb_uio, uio_pci_generic, or vfio-pci). I'm assuming you are running the virtio-net driver in the guest, and that you are connecting M1 and R1, and R1 and M2, with Linux kernel bridges.

If you still run into issues, you may want to check the neighbor table and routing table in the VPP system first, and maybe the interface counters as well.
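For example, from the VPP CLI, commands along these lines cover the neighbor table, the FIB, the driver bound to each NIC, and the error counters:

vpp# show ip neighbors            (neighbor table)
vpp# show ip fib                  (IPv4 routing table / FIB)
vpp# show hardware-interfaces     (PCI address and driver per interface)
vpp# show interface               (per-interface counters, including drops and tx-error)
vpp# show errors                  (per-node error counters)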
Regards,
Xiaodong

On Sat, Oct 1, 2022 at 3:55 AM Bu Wentian <buwent...@outlook.com> wrote:

Hi everyone,

I am a beginner with VPP, and I'm trying to use VPP+FRR on KVM VMs as routers. I have installed VPP and FRR on Ubuntu 20.04.5 VMs and made them run in a separate network namespace. I use the VPP linux-cp plugin to synchronize routes from the kernel stack to VPP. VPP and FRR seem to work, but when I use iperf3 to test the throughput, I find that VPP's performance is not good.

I created a very simple topology to test the throughput:

M1 ----- R1 (with VPP) ----- M2

M1 and M2 are also Ubuntu VMs (without VPP), in different subnets. I ran the iperf3 server on M1 and the client on M2, but only got about 2.1 Gbps of throughput, which is significantly worse than using the Linux kernel as a router (about 26.1 Gbps).

I ran another experiment on this topology:

M1 ------ R1 (with VPP) ---- R2 (with VPP) ------ M2

The iperf3 result is even worse (only 1.6 Gbps).

I also noticed that many retransmissions happened during the iperf3 test. If I use the Linux kernel as the router rather than VPP, no retransmissions happen. Part of the iperf3 output:

[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   166 MBytes  1.39 Gbits/sec   23    344 KBytes
[  5]   1.00-2.00   sec   179 MBytes  1.50 Gbits/sec   49    328 KBytes
[  5]   2.00-3.00   sec   203 MBytes  1.70 Gbits/sec   47    352 KBytes
[  5]   3.00-4.00   sec   203 MBytes  1.70 Gbits/sec   54    339 KBytes
[  5]   4.00-5.00   sec   211 MBytes  1.77 Gbits/sec   59    325 KBytes

Another phenomenon I found is that when I ran iperf3 directly on R1 and R2, I got no throughput at all. The iperf3 output looks like this:

[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   324 KBytes  2.65 Mbits/sec    4   8.74 KBytes
[  5]   1.00-2.00   sec  0.00 Bytes   0.00 bits/sec     1   8.74 KBytes
[  5]   2.00-3.00   sec  0.00 Bytes   0.00 bits/sec     0   8.74 KBytes
[  5]   3.00-4.00   sec  0.00 Bytes   0.00 bits/sec     1   8.74 KBytes
[  5]   4.00-5.00   sec  0.00 Bytes   0.00 bits/sec     0   8.74 KBytes
[  5]   5.00-6.00   sec  0.00 Bytes   0.00 bits/sec     0   8.74 KBytes
[  5]   6.00-7.00   sec  0.00 Bytes   0.00 bits/sec     1   8.74 KBytes
[  5]   7.00-8.00   sec  0.00 Bytes   0.00 bits/sec     0   8.74 KBytes
[  5]   8.00-9.00   sec  0.00 Bytes   0.00 bits/sec     0   8.74 KBytes
[  5]   9.00-10.00  sec  0.00 Bytes   0.00 bits/sec     0   8.74 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   324 KBytes   266 Kbits/sec    7             sender
[  5]   0.00-10.00  sec  0.00 Bytes   0.00 bits/sec                   receiver

All my VMs have 4 vCPUs and 8 GB of RAM. The host machine has 16 cores (32 threads) and 32 GB of RAM. The VMs are connected by libvirt networks.

I installed VPP + FRR following this tutorial: https://ipng.ch/s/articles/2021/12/23/vpp-playground.html
The VPP startup.conf is in the attachment.

I want to know why the VPP throughput is worse than the Linux kernel's, and what I can do to improve it (I hope to make it better than Linux kernel forwarding). I have searched Google for a solution but found nothing helpful. Any help would be appreciated. Please contact me if more information or logs are needed.

Sincerely,
Wentian Bu