You are right, Olivier, thanks for your suggestion - it looks even better. I've tested this version and the performance is great - will send a v2 shortly.
Regards,
Alex

> -----Original Message-----
> From: Olivier Matz <olivier.m...@6wind.com>
> Sent: Thursday, March 19, 2020 5:30
> To: Alexander Kozyrev <akozy...@mellanox.com>
> Cc: dev@dpdk.org; Slava Ovsiienko <viachesl...@mellanox.com>; Matan
> Azrad <ma...@mellanox.com>; Thomas Monjalon
> <tho...@monjalon.net>; sta...@dpdk.org
> Subject: Re: [PATCH] mbuf: optimize memory loads during mbuf freeing
>
> Hi,
>
> On Mon, Mar 16, 2020 at 06:31:40PM +0000, Alexander Kozyrev wrote:
> > Introduction of pinned external buffers doubled memory loads in the
> > rte_pktmbuf_prefree_seg() function. Analysis of the generated assembly
> > code shows unnecessary load of the pool field of the rte_mbuf structure.
> > Here is the snippet of the assembly for "if (!RTE_MBUF_DIRECT(m))":
> > Before the change the code was:
> >   movq 0x18(%rbx), %rax // load the ol_flags field
> >   test %r13, %rax // check if ol_flags equals to 0x60...0
> >   jz 0x9a8718 <Block 2> // jump out to "if (m->next != NULL)"
> > After the change the code becomed:
> >   movq 0x18(%rbx), %rax // load ol_flags
> >   test %r14, %rax // check if ol_flags equals to 0x60...0
> >   jnz 0x9bea38 <Block 2> // jump in to "if (!RTE_MBUF_HAS_EXTBUF(m)"
> >   movq 0x48(%rbx), %rax // load the pool field
> >   jmp 0x9bea78 <Block 7> // jump out to "if (m->next != NULL)"
> > Look like this absolutely unneeded memory load of the pool field is an
> > optimization for the external buffer case in GCC (4.8.5), since Clang
> > generates the same assembly for both before and after the chenge versions.
> > Plus, GCC favors the extrnal buffer case over the simple case.
> > This assembly code layout causes the performance degradation because the
> > rte_pktmbuf_prefree_seg() function is a part of a very hot path.
> > Workaround this compilation issue by moving the check for pinned
> > buffer apart from the check for external buffer and restore the
> > initial code flow that favors the direct mbuf case over the external one.
> >
> > Fixes: 6ef1107ad4c6 ("mbuf: detach mbuf with pinned external buffer")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Alexander Kozyrev <akozy...@mellanox.com>
> > Acked-by: Viacheslav Ovsiienko <viachesl...@mellanox.com>
> > ---
> >  lib/librte_mbuf/rte_mbuf.h | 14 ++++++--------
> >  1 file changed, 6 insertions(+), 8 deletions(-)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 34679e0..ab9d3f5 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -1335,10 +1335,9 @@ static inline int __rte_pktmbuf_pinned_extbuf_decref(struct rte_mbuf *m)
> >      if (likely(rte_mbuf_refcnt_read(m) == 1)) {
> >
> >          if (!RTE_MBUF_DIRECT(m)) {
> > -            if (!RTE_MBUF_HAS_EXTBUF(m) ||
> > -                !RTE_MBUF_HAS_PINNED_EXTBUF(m))
> > -                rte_pktmbuf_detach(m);
> > -            else if (__rte_pktmbuf_pinned_extbuf_decref(m))
> > +            rte_pktmbuf_detach(m);
> > +            if (RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
> > +                __rte_pktmbuf_pinned_extbuf_decref(m))
> >                  return NULL;
> >          }
> >
> [...]
>
> Reading the previous code again, it was correct but not easy to understand,
> especially the:
>
>     if (!RTE_MBUF_HAS_EXTBUF(m) || !RTE_MBUF_HAS_PINNED_EXTBUF(m))
>
> Knowing we already checked it is not a direct mbuf, it is equivalent to:
>
>     if (!RTE_MBUF_HAS_PINNED_EXTBUF(m))
>
> I think the objective was to avoid an access to the pool flags if not
> necessary.
>
> Completely removing the test as you did is also functionally OK, because
> rte_pktmbuf_detach() also does the check, and the code is even clearer.
>
> I wonder however if doing this wouldn't avoid an access to the pool flags for
> mbufs which have the IND_ATTACHED flags:
>
>     if (!RTE_MBUF_DIRECT(m)) {
>         rte_pktmbuf_detach(m);
>         if (RTE_MBUF_HAS_EXTBUF(m) &&
>             RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
>             __rte_pktmbuf_pinned_extbuf_decref(m))
>             return NULL;
>     }
>
> What do you think?
>
> Nit: if you wish to send a v2, there are few english fixes that could be done
> (becomed, chenge, extrnal)
>
> Thanks