>
> On 10/14/2020 1:15 PM, Li,Rongqing wrote:
> >
> >
> >> -----Original Message-----
> >> From: Loftus, Ciara [mailto:ciara.lof...@intel.com]
> >> Sent: Friday, October 02, 2020 12:24 AM
> >> To: Li,Rongqing <lirongq...@baidu.com>
> >> Cc: dev@dpdk.org
> >> Subject: RE: [PATCH][v2] net/af_xdp: avoid to unnecessary allocation and free
> >> mbuf in rx path
> >>
> >>>
> >>> when receive packets, the max bunch number of mbuf are allocated if
> >>> hardware does not receive the max bunch number packets, it will free
> >>> redundancy mbuf, that is low-performance
> >>>
> >>> so optimize rx performance, by allocating number of mbuf based on
> >>> result of xsk_ring_cons__peek, to avoid to redundancy allocation, and
> >>> free mbuf when receive packets
> >>
> >> Hi,
> >>
> >> Thanks for the patch and fixing the issue I raised.
> >
> > Thanks for your finding
> >
> >> With my testing so far I haven't measured an improvement in performance
> >> with the patch.
> >> Do you have data to share which shows the benefit of your patch?
> >>
> >> I agree the potential excess allocation of mbufs for the fill ring is not
> >> the most optimal, but if doing it does not significantly impact the
> >> performance I would be in favour of keeping that approach versus touching
> >> the cached_cons outside of libbpf which is unconventional.
> >>
> >> If a benefit can be shown and we proceed with the approach, I would suggest
> >> creating a new function for the cached consumer rollback eg.
> >> xsk_ring_cons_cancel() or similar, and add a comment describing what it does.
> >>
> >
> > Thanks for your test.
> >
> > Yes, it has benefit
> >
> > We first see this issue when do some send performance, topo is like below
> >
> > Qemu with vhost-user ----->ovs------->xdp interface
> >
> > Qemu sends udp packets, xdp has not packets to receive, but it must be
> > polled by ovs, and xdp must allocated/free mbuf unnecessary, with this
> > packet, we has about 5% benefit for sending, this depends on flow table
> > complexity
> >
> >
> > When do rx benchmark, if packets per batch is reaching about 32, the
> > benefit is very little.
> > If packets per batch is far less than 32, we can see the cycle per packet is
> > reduced obviously
> >
>
> Hi Li, Ciara,
>
> What is the status of this patch, is the patch justified and is a new versions
> requested/expected?

Apologies for the delay, I missed your reply Li.

With the data you've provided I think the patch is justified.
I think the rollback requires some explanation in the code as it may not be
immediately clear what is happening.
I suggest a v3 with either a comment above the rollback, or a new function as
described in my previous mail, also with a comment.

Thanks for the patch.
Ciara
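
For readers following the thread, here is a rough sketch of the "peek first, allocate only what was received, roll back if needed" idea with the rollback wrapped in a commented helper. It is not the patch itself. `struct demo_ring_cons`, `demo_ring_cons_peek()`, `demo_ring_cons_cancel()`, `demo_alloc_bulk()` and `demo_rx_burst()` are hypothetical stand-ins invented for illustration, not libbpf or DPDK APIs, and it is an assumption here that the rollback is triggered when the mbuf allocation that follows the peek fails.

```c
#include <stdint.h>

/* Simplified stand-in for a consumer ring. Only the two cached indices
 * matter for this illustration; this is NOT libbpf's struct xsk_ring_cons. */
struct demo_ring_cons {
    uint32_t cached_prod; /* producer index as last seen by the consumer */
    uint32_t cached_cons; /* consumer index, advanced locally by a peek */
};

/* Reserve up to nb entries, peek-style: advances cached_cons and returns
 * how many entries were actually available. */
static uint32_t demo_ring_cons_peek(struct demo_ring_cons *r, uint32_t nb)
{
    uint32_t avail = r->cached_prod - r->cached_cons;

    if (nb > avail)
        nb = avail;
    r->cached_cons += nb;
    return nb;
}

/* Hypothetical rollback helper: hand back nb entries that were peeked but
 * cannot be processed, so that a later peek returns them again. */
static void demo_ring_cons_cancel(struct demo_ring_cons *r, uint32_t nb)
{
    r->cached_cons -= nb;
}

/* Stub bulk allocator (assumption): all-or-nothing, succeeds only when the
 * pool still holds at least n free buffers. */
static int demo_alloc_bulk(uint32_t *pool_free, uint32_t n)
{
    if (n > *pool_free)
        return -1;
    *pool_free -= n;
    return 0;
}

/* Rx burst: peek first, then allocate exactly as many buffers as packets
 * were received. An empty poll costs no allocation and no free. */
static uint32_t demo_rx_burst(struct demo_ring_cons *rx, uint32_t *pool_free,
                              uint32_t max_burst)
{
    uint32_t rcvd = demo_ring_cons_peek(rx, max_burst);

    if (rcvd == 0)
        return 0;

    if (demo_alloc_bulk(pool_free, rcvd) != 0) {
        /* Allocation failed: undo the peek so the packets are not lost. */
        demo_ring_cons_cancel(rx, rcvd);
        return 0;
    }
    return rcvd;
}
```

In this sketch an empty ring never touches the allocator, which corresponds to the send-heavy case Li describes where the xdp interface is polled but has nothing to receive.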