Hi Jieqiang,

This looks like an interesting optimization, but you need to check that the "mbufs to be freed should be coming from the same mempool" rule holds true. This won't be the case on multi-NUMA systems (VPP creates one buffer pool per NUMA node). This should be easy to check with e.g. 'vec_len (vm->buffer_main->buffer_pools) == 1'.
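Concretely, that check could gate the new flag, e.g. something like this (a minimal sketch; the exact placement inside dpdk_lib_init () and the availability of 'vm' at that point are assumptions):

    /* Sketch: only request fast-free when VPP runs with a single
       buffer pool, i.e. all mbufs come from the same mempool. */
    if (dm->conf->no_multi_seg && dm->conf->no_tx_checksum_offload
        && vec_len (vm->buffer_main->buffer_pools) == 1)
      xd->port_conf.txmode.offloads |= DEV_TX_OFFLOAD_MBUF_FAST_FREE;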
For the rest, I think we do not use DPDK mbuf refcounting at all as we maintain our own anyway, but someone more knowledgeable than me should confirm. I'd be curious to see if we can measure a real performance difference in CSIT.

Best
ben

> -----Original Message-----
> From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Jieqiang Wang
> Sent: Friday, September 17, 2021 06:07
> To: vpp-dev <vpp-dev@lists.fd.io>
> Cc: Lijian Zhang <lijian.zh...@arm.com>; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; Govindarajan Mohandoss <govindarajan.mohand...@arm.com>; Ruifeng Wang <ruifeng.w...@arm.com>; Tianyu Li <tianyu...@arm.com>; Feifei Wang <feifei.wa...@arm.com>; nd <n...@arm.com>
> Subject: [vpp-dev] Enable DPDK tx offload flag mbuf-fast-free on VPP vector mode
>
> Hi VPP maintainers,
>
> Recently VPP upgraded its DPDK version to DPDK 21.08, which includes two optimization patches [1][2] from the Arm DPDK team. When the mbuf-fast-free flag is set, the two patches add code to accelerate mbuf freeing in the i40e PMD TX path, which shows a clear performance improvement in DPDK L3FWD benchmarking results.
>
> I tried to verify the benefit these optimization patches can bring to VPP, but found that the mbuf-fast-free flag is not enabled by default in VPP+DPDK.
>
> Applying DPDK mbuf-fast-free has some constraints, e.g.:
>
> * mbufs to be freed should be coming from the same mempool
> * ref_cnt == 1 always in mbuf meta-data when user apps call DPDK rte_eth_tx_burst ()
> * No TX checksum offload
> * No jumbo frames
>
> But VPP vector mode (set by adding the 'no-tx-checksum-offload' and 'no-multi-seg' parameters in the dpdk section of startup.conf) seems to satisfy all of these requirements. So I made a few code changes, shown below, to set the mbuf-fast-free flag by default in VPP vector mode and did some benchmarking of IPv4 routing test cases with 1 flow/10k flows. The benchmarking results show both a throughput improvement and CPU cycles saved in the DPDK transmit function.
>
> So, any thoughts on enabling the mbuf-fast-free tx offload flag in VPP vector mode?
Any feedback is welcome :)
>
> Code Changes:
>
> diff --git a/src/plugins/dpdk/device/init.c b/src/plugins/dpdk/device/init.c
> index f7c1cc106..0fbdd2317 100644
> --- a/src/plugins/dpdk/device/init.c
> +++ b/src/plugins/dpdk/device/init.c
> @@ -398,6 +398,8 @@ dpdk_lib_init (dpdk_main_t * dm)
>        xd->port_conf.rxmode.offloads |= DEV_RX_OFFLOAD_SCATTER;
>        xd->flags |= DPDK_DEVICE_FLAG_MAYBE_MULTISEG;
>      }
> +  if (dm->conf->no_multi_seg && dm->conf->no_tx_checksum_offload)
> +    xd->port_conf.txmode.offloads |= DEV_TX_OFFLOAD_MBUF_FAST_FREE;
>
>    xd->tx_q_used = clib_min (dev_info.max_tx_queues, tm->n_vlib_mains);
>
> Benchmark Results:
>
> 1 flow, bidirectional
>
> Throughput (Mpps):
>
>               Original   Patched   Ratio
> N1SDP         11.62      12.44     +7.06%
> ThunderX2      9.52      10.16     +6.30%
> Dell 8268     17.82      18.20     +2.13%
>
> CPU cycles overhead for DPDK transmit function (recorded by Perf tools):
>
>               Original   Patched   Ratio
> N1SDP         13.08%      5.53%    -7.55%
> ThunderX2     11.01%      6.68%    -4.33%
> Dell 8268     10.78%      7.35%    -3.43%
>
> 10k flows, bidirectional
>
> Throughput (Mpps):
>
>               Original   Patched   Ratio
> N1SDP          8.48       9.00     +6.13%
> ThunderX2      8.84       9.26     +4.75%
> Dell 8268     15.04      15.40     +2.39%
>
> CPU cycles overhead for DPDK transmit function (recorded by Perf tools):
>
>               Original   Patched   Ratio
> N1SDP         10.58%      4.54%    -6.04%
> ThunderX2     12.92%      6.63%    -6.29%
> Dell 8268     10.36%      7.97%    -2.39%
>
> [1] http://git.dpdk.org/dpdk/commit/?h=v21.08-rc1&id=be8ff6210851fdacbe00033259b7dc5426e95589
> [2] http://git.dpdk.org/dpdk/commit/?h=v21.08-rc1&id=95e7bb6a5fc9e371e763b11ec15786e4d574ef8e
>
> Best Regards,
> Jieqiang Wang
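For anyone who wants to double-check the 'ref_cnt == 1' and same-mempool constraints at runtime before enabling the flag, a debug helper along these lines could be called on a burst just before rte_eth_tx_burst () (purely illustrative; fast_free_sanity_check and its placement are hypothetical, not part of the patch above):

    #include <rte_mbuf.h>
    #include <rte_debug.h>

    /* Debug-only helper: mbuf-fast-free assumes every mbuf handed to
       rte_eth_tx_burst () has refcnt == 1 and that the whole burst
       comes from a single mempool. RTE_ASSERT compiles away unless
       asserts are enabled in the DPDK build. */
    static inline void
    fast_free_sanity_check (struct rte_mbuf **mbufs, uint16_t n)
    {
      uint16_t i;
      for (i = 0; i < n; i++)
        {
          RTE_ASSERT (rte_mbuf_refcnt_read (mbufs[i]) == 1);
          RTE_ASSERT (mbufs[i]->pool == mbufs[0]->pool);
        }
    }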