Hello,
I would like to write an IP packet processing program, and I chose DPDK 
because of its promising performance.

I am trying to build a virtual test environment to make development cheaper 
(so that we do not have to buy HW for each developer). My test setup is:
- VMWare Workstation 11
- using DPDK 2.0.0
- with Linux kernel 3.10.0, CentOS 7
- gcc 4.8.3
- and the standard VMXNET3 driver provided by CentOS 7, with the 
uio_pci_generic kernel module
(Should I use vmxnet3-usermap.ko with DPDK 2.0.0? Where can I find it, and how 
do I compile it?)

I set up 3 machines:
- set all machines' network interface type to VMXNET3
- set up one machine (C1) for issuing ping, its interface has an IP: 
192.168.3.21
- set up one machine (C2) for being the ping target, its interface has an IP: 
192.168.3.23
- set up one machine (BR) to act as an L2 bridge using some of the examples 
provided. DPDK compiled properly, 256 x 2 MB hugepages were created, and the 
example application runs without (major) errors.
- the three machines are connected in a line: C1 - BR - C2, using two private 
networks, one on each side of BR (VMnet2 and VMnet3), so the VMs are connected 
by vSwitches

Ping replies arrive and definitely pass through BR (confirmed by extra console 
logs), but there are unexpected delays with examples/skeleton/basicfwd:
[root@localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=1018 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=18.7 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=8.87 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=1010 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=10.2 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=1012 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=12.7 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=1049 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=49.8 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=9.02 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=8.74 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=1007 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=8.03 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=18 ttl=64 time=8.96 ms
64 bytes from 192.168.3.23: icmp_seq=19 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=20 ttl=64 time=9.27 ms
64 bytes from 192.168.3.23: icmp_seq=21 ttl=64 time=1008 ms
...

When I switched BR to multi_process/client_server_mp with 2 client processes, 
the result was almost the same:
[root@localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=3.50 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=1002 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=3.94 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=1001 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=1010 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=2003 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=2.29 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=3002 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=2.66 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=3003 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=2.87 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=3003 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=2.88 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=1001 ms
64 bytes from 192.168.3.23: icmp_seq=18 ttl=64 time=2.70 ms
...

And when I switched BR to test-pmd, the ping results were more or less normal 
(all command-line options left at their defaults):
[root@localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=3.52 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=33.2 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=3.97 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=25.5 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=61.1 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=36.3 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=35.5 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=33.0 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=5.32 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=14.6 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=34.5 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=4.67 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=55.0 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=4.93 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=5.98 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=5.41 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=21.0 ms
...

Although I think these values are still quite high, I can accept them, as this 
is a virtualized environment.

Could someone please explain to me what is going on with the basicfwd and 
client-server examples? As I understand it, each packet should pass through BR 
as fast as possible, but it seems that rte_eth_rx_burst retrieves packets only 
when there are at least 2 packets in the RX queue of the NIC. That is the case 
most of the time, at least: according to my console log, there are rare cases 
when it retrieves a single packet, and sometimes it retrieves 3 packets at 
once.
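
To make this hypothesis concrete, here is a small Python model of it. This is 
not DPDK code and says nothing about what the VMXNET3 driver really does; it 
only checks whether "a port hands over packets only once at least 2 are 
waiting" would produce the alternating pattern seen with basicfwd. The names 
(release_times, simulate_ping, min_batch, hop_ms) and the 2 ms per-hop latency 
are my own assumptions for illustration, not measured values.

```python
def release_times(arrivals, min_batch):
    """Release time of each packet at one port, or None if it stays queued.

    Assumed model: the port forwards nothing until at least `min_batch`
    packets are waiting, then forwards all of them at once.
    `arrivals` must be sorted in ascending order (times in ms).
    """
    released = [None] * len(arrivals)
    pending = []
    for idx, t in enumerate(arrivals):
        pending.append(idx)
        if len(pending) >= min_batch:
            for p in pending:
                released[p] = t
            pending = []
    return released


def simulate_ping(count, interval_ms=1000.0, hop_ms=2.0, min_batch=2):
    """Round-trip time per echo request over C1 - BR - C2 (None = no reply)."""
    sent = [i * interval_ms for i in range(count)]

    # Requests reach the C1-side port of BR one hop after being sent.
    req_release = release_times([t + hop_ms for t in sent], min_batch)
    delivered = [i for i, r in enumerate(req_release) if r is not None]

    # BR -> C2 is one hop, C2 replies at once, C2 -> BR is another hop.
    reply_arrivals = [req_release[i] + 2 * hop_ms for i in delivered]
    reply_release = release_times(reply_arrivals, min_batch)

    rtts = [None] * count
    for i, r in zip(delivered, reply_release):
        if r is not None:
            rtts[i] = r + hop_ms - sent[i]  # last hop: BR -> C1
    return rtts


if __name__ == "__main__":
    print(simulate_ping(6, min_batch=2))
    print(simulate_ping(6, min_batch=1))
```

With min_batch=2 the model alternates between roughly one ping interval plus 
the path latency and the path latency alone, which has the same shape as the 
basicfwd output above. With min_batch=1 every reply takes only the path 
latency. This shows the hypothesis is consistent with the numbers, not that it 
is the actual cause.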

What is the difference that lets test-pmd work without major delays while the 
others do not?



Thanks,
Sandor
