Hi Jianfeng,

On Mon, Jun 26, 2017 at 12:03:33AM +0800, Tan, Jianfeng wrote:
>
>
> On 6/23/2017 10:43 PM, Jiayu Hu wrote:
> > Generic Receive Offload (GRO) is a widely used SW-based offloading
> > technique to reduce per-packet processing overhead. It gains performance
> > by reassembling small packets into large ones. Therefore, we propose to
> > support GRO in DPDK.
> >
> > To enable more flexibility to applications, DPDK GRO is implemented as
> > a user library. Applications explicitly use the GRO library to merge
> > small packets into large ones. DPDK GRO provides two reassembly modes:
> > lightweight mode and heavyweight mode. If applications want to merge
> > packets in a simple way, they can select lightweight mode API. If
> > applications need more fine-grained controls, they can select heavyweight
> > mode API.
> >
> > This patchset is to support TCP/IPv4 GRO in DPDK. The first patch is to
> > provide a GRO API framework. The second patch is to support TCP/IPv4 GRO.
> > The last patch is to enable TCP/IPv4 GRO in testpmd.
> >
> > We perform many iperf tests to see the performance gains from DPDK GRO.
> >
> > The test environment is:
> > a. two 25Gbps physical ports (p0 and p1) are linked together. Assign p0
> >    to one networking namespace and assign p1 to DPDK;
> > b. enable TSO for p0. Run iperf client on p0;
> > c. launch testpmd with p1 and a vhost-user port, and run it in csum
> >    forwarding mode. Select TCP HW checksum calculation for the
> >    vhost-user port in csum forwarding engine. And for better
> >    performance, we select IPv4 and TCP HW checksum calculation for p1
> >    too;
> > d. launch a VM with one CPU core and a virtio-net port. The VM OS is
> >    ubuntu 16.04 whose virtio-net driver supports GRO. Enable RX csum
> >    offloading and mrg_rxbuf for the VM. Iperf server runs in the VM;
> > e. to run iperf tests, we need to avoid the csum forwarding engine
> >    compulsorily changing packet mac addresses.
> >    So in our tests, we comment these codes out (line701 ~ line704 in
> >    csumonly.c).
> >
> > In each test, we run iperf with the following three configurations:
> >     - single flow and single TCP stream
> >     - multiple flows and single TCP stream
> >     - single flow and parallel TCP streams
>
> To me, flow == TCP stream; so could you explain what flow means?

Sorry, I used inappropriate terms. 'flow' means TCP connection here. And
'multiple TCP streams' means parallel iperf-client threads.

Thanks,
Jiayu

>
> >
> > We run above iperf tests on three scenarios:
> >     s1: disabling kernel GRO and enabling DPDK GRO
> >     s2: disabling kernel GRO and disabling DPDK GRO
> >     s3: enabling kernel GRO and disabling DPDK GRO
> > Comparing the throughput of s1 with s2, we can see the performance gains
> > from DPDK GRO. Comparing the throughput of s1 and s3, we can compare DPDK
> > GRO performance with kernel GRO performance.
> >
> > Test results:
> >     - DPDK GRO throughput is almost 2 times the throughput of no
> >       DPDK GRO and no kernel GRO;
> >     - DPDK GRO throughput is almost 1.2 times the throughput of
> >       kernel GRO.
> >
> > Change log
> > ==========
> > v6:
> > - avoid checksum validation and calculation
> > - enable to process IP fragmented packets
> > - add a command in testpmd
> > - update documents
> > - modify rte_gro_timeout_flush and rte_gro_reassemble_burst
> > - rename variable name
> > v5:
> > - fix some bugs
> > - fix coding style issues
> > v4:
> > - implement DPDK GRO as an application-used library
> > - introduce lightweight and heavyweight working modes to enable
> >   fine-grained controls to applications
> > - replace cuckoo hash tables with simpler table structure
> > v3:
> > - fix compilation issues.
> > v2:
> > - provide generic reassembly function;
> > - implement GRO as a device ability:
> >   add APIs for devices to support GRO;
> >   add APIs for applications to enable/disable GRO;
> > - update testpmd example.
> >
> > Jiayu Hu (3):
> >   lib: add Generic Receive Offload API framework
> >   lib/gro: add TCP/IPv4 GRO support
> >   app/testpmd: enable TCP/IPv4 GRO
> >
> >  app/test-pmd/cmdline.c                      | 125 +++++++++
> >  app/test-pmd/config.c                       |  37 +++
> >  app/test-pmd/csumonly.c                     |   5 +
> >  app/test-pmd/testpmd.c                      |   3 +
> >  app/test-pmd/testpmd.h                      |  11 +
> >  config/common_base                          |   5 +
> >  doc/guides/rel_notes/release_17_08.rst      |   7 +
> >  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  34 +++
> >  lib/Makefile                                |   2 +
> >  lib/librte_gro/Makefile                     |  51 ++++
> >  lib/librte_gro/rte_gro.c                    | 221 ++++++++++++++++
> >  lib/librte_gro/rte_gro.h                    | 195 ++++++++++++++
> >  lib/librte_gro/rte_gro_tcp.c                | 393 ++++++++++++++++++++++++++++
> >  lib/librte_gro/rte_gro_tcp.h                | 188 +++++++++++++
> >  lib/librte_gro/rte_gro_version.map          |  12 +
> >  mk/rte.app.mk                               |   1 +
> >  16 files changed, 1290 insertions(+)
> >  create mode 100644 lib/librte_gro/Makefile
> >  create mode 100644 lib/librte_gro/rte_gro.c
> >  create mode 100644 lib/librte_gro/rte_gro.h
> >  create mode 100644 lib/librte_gro/rte_gro_tcp.c
> >  create mode 100644 lib/librte_gro/rte_gro_tcp.h
> >  create mode 100644 lib/librte_gro/rte_gro_version.map
> >
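
To make the terminology in this thread concrete, here is a minimal C sketch of
what "merging small packets of one flow" means when a flow is one TCP
connection. It is only an illustration and is not the librte_gro code from
this patchset: the types struct flow_key and struct seg and the functions
same_flow() and gro_merge() are hypothetical names, packets are reduced to a
sequence number and a payload length, and headers, checksums, and timeouts
are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key: identifies one TCP connection (one "flow"). */
struct flow_key {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
};

/* Hypothetical, heavily simplified TCP segment. */
struct seg {
	struct flow_key key;
	uint32_t seq; /* sequence number of the first payload byte */
	uint32_t len; /* payload length in bytes */
};

/* Two segments belong to the same flow if all four key fields match. */
static int same_flow(const struct flow_key *a, const struct flow_key *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/*
 * Merge segments in place. A segment is appended to an earlier one when
 * both belong to the same flow and it starts exactly where the earlier
 * one ends; otherwise it is kept as a separate packet.
 * Returns the number of packets left at the front of pkts[].
 */
static size_t gro_merge(struct seg *pkts, size_t n)
{
	size_t out = 0;

	assert(pkts != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		size_t j;

		for (j = 0; j < out; j++) {
			/* uint32_t addition wraps like TCP sequence numbers */
			if (same_flow(&pkts[j].key, &pkts[i].key) &&
			    pkts[j].seq + pkts[j].len == pkts[i].seq) {
				pkts[j].len += pkts[i].len;
				break;
			}
		}
		if (j == out) /* no neighbour found: keep as its own packet */
			pkts[out++] = pkts[i];
	}
	return out;
}
```

With several TCP connections in one burst, segments are merged only within
their own connection, so a burst with more flows generally leaves more
packets after merging than a single-flow burst of the same size.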