> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev,
> Konstantin
> Sent: Thursday, November 5, 2020 2:56 PM
> 
> > >
> > > >
> > > > On Thu, Nov 05, 2020 at 11:34:18AM +0000, Ananyev, Konstantin
> wrote:
> > > > >
> > > > >
> > > > > > On Thu, Nov 05, 2020 at 11:26:51AM +0300, Andrew Rybchenko
> wrote:
> > > > > > > On 11/5/20 10:46 AM, Olivier Matz wrote:
> > > > > > > > On Thu, Nov 05, 2020 at 12:15:49AM +0000, Ananyev,
> Konstantin wrote:
> > > > > > > >>
> > > > > > > >> Hi Olivier,
> > > > > > > >>
> > > > > > > >>> m->nb_seg must be reset on mbuf free whatever the value
> of m->next,
> > > > > > > >>> because it can happen that m->nb_seg is != 1. For
> instance in this
> > > > > > > >>> case:
> > > > > > > >>>
> > > > > > > >>>   m1 = rte_pktmbuf_alloc(mp);
> > > > > > > >>>   rte_pktmbuf_append(m1, 500);
> > > > > > > >>>   m2 = rte_pktmbuf_alloc(mp);
> > > > > > > >>>   rte_pktmbuf_append(m2, 500);
> > > > > > > >>>   rte_pktmbuf_chain(m1, m2);
> > > > > > > >>>   m0 = rte_pktmbuf_alloc(mp);
> > > > > > > >>>   rte_pktmbuf_append(m0, 500);
> > > > > > > >>>   rte_pktmbuf_chain(m0, m1);
> > > > > > > >>>
> > > > > > > >>> As rte_pktmbuf_chain() does not reset nb_seg in the
> initial m1
> > > > > > > >>> segment (this is not required), after this code the
> mbuf chain
> > > > > > > >>> have 3 segments:
> > > > > > > >>>   - m0: next=m1, nb_seg=3
> > > > > > > >>>   - m1: next=m2, nb_seg=2
> > > > > > > >>>   - m2: next=NULL, nb_seg=1
> > > > > > > >>>
> > > > > > > >>> Freeing this mbuf chain will not restore nb_seg=1 in
> the second
> > > > > > > >>> segment.
> > > > > > > >>
> > > > > > > >> Hmm, not sure why is that?
> > > > > > > >> You are talking about freeing m1, right?
> > > > > > > >> rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > > > > >> {
> > > > > > > >>        ...
> > > > > > > >>        if (m->next != NULL) {
> > > > > > > >>                         m->next = NULL;
> > > > > > > >>                         m->nb_segs = 1;
> > > > > > > >>                 }
> > > > > > > >>
> > > > > > > >> m1->next != NULL, so it will enter the if() block,
> > > > > > > >> and will reset both next and nb_segs.
> > > > > > > >> What I am missing here?
> > > > > > > >> Thinking in more generic way, that change:
> > > > > > > >>  -             if (m->next != NULL) {
> > > > > > > >>  -                     m->next = NULL;
> > > > > > > >>  -                     m->nb_segs = 1;
> > > > > > > >>  -             }
> > > > > > > >>  +             m->next = NULL;
> > > > > > > >>  +             m->nb_segs = 1;
> > > > > > > >
> > > > > > > > Ah, sorry. I oversimplified the example and now it does
> not
> > > > > > > > show the issue...
> > > > > > > >
> > > > > > > > The full example also adds a split() to break the mbuf
> chain
> > > > > > > > between m1 and m2. The kind of thing that would be done
> for
> > > > > > > > software TCP segmentation.
> > > > > > > >
> > > > > > >
> > > > > > > If so, may be the right solution is to care about nb_segs
> > > > > > > when next is set to NULL on split? Any place when next is
> set
> > > > > > > to NULL. Just to keep the optimization in a more generic
> place.
> > > > >
> > > > >
> > > > > > The problem with that approach is that there are already
> several
> > > > > > existing split() or trim() implementations in different DPDK-
> based
> > > > > > applications. For instance, we have some in 6WINDGate. If we
> force
> > > > > > applications to set nb_seg to 1 when resetting next, it has
> to be
> > > > > > documented because it is not straightforward.
> > > > >
> > > > > I think it is better to go that way.
> > > > > From my perspective it seems natural to reset nb_seg at same
> time
> > > > > we reset next, otherwise inconsistency will occur.
> > > >
> > > > While it is not explicitly stated for nb_segs, to me it was clear
> that
> > > > nb_segs is only valid in the first segment, like for many fields
> (port,
> > > > ol_flags, vlan, rss, ...).
> > > >
> > > > If we say that nb_segs has to be valid in any segments, it means
> that
> > > > chain() or split() will have to update it in all segments, which
> is not
> > > > efficient.
> > >
> > > Why in all?
> > > We can state that nb_segs on non-first segment should always equal
> 1.
> > > As I understand in that case, both split() and chain() have to
> update nb_segs
> > > only for head mbufs, rest ones will remain untouched.
> >
> > Well, anyway, I think it's strange to have a constraint on m->nb_segs
> for
> > non-first segment. We don't have that kind of constraints for other
> fields.
> 
> True, we don't. But this is one of the fields we consider critical
> for proper work of mbuf alloc/free mechanism.
> 

I am not sure that requiring m->nb_segs == 1 on non-first segments will provide 
any benefits.

E.g. the second segment of a three-segment chain will still have m->next != 
NULL, so it cannot be used as a gate to prevent accessing m->next.

I might have overlooked something, though.

> >
> >
> > >
> > > >
> > > > Saying that nb_segs has to be valid for the first and last
> segment seems
> > > > really odd to me. What is the logic behind that except keeping
> this test
> > > > as is?
> > > >
> > > > In any case, it has to be better documented.
> > > >
> > > >
> > > > Olivier
> > > >
> > > >
> > > > > > I think the approach from
> > > > > > this patch is safer.
> > > > >
> > > > > It might be easier from perspective that changes in less places
> are required,
> > > > > Though I think that this patch will introduce some performance
> drop.
> > > > > As now each mbuf_prefree_seg() will cause update of 2 cache
> lines unconditionally.
> > > > >
> > > > > > By the way, for 21.11, if we are able to do some
> optimizations and have
> > > > > > both pool (index?) and next in the first cache line, we may
> reconsider
> > > > > > the fact that next and nb_segs are already set for new
> allocated mbufs,
> > > > > > because it is not straightforward either.
> > > > >
> > > > > My suggestion - let's put future optimization discussion aside
> for now,
> > > > > and concentrate on that particular patch.
> > > > >
> > > > > >
> > > > > > > > After this operation, we have 2 mbuf chain:
> > > > > > > >  - m0 with 2 segments, the last one has next=NULL but
> nb_seg=2
> > > > > > > >  - new_m with 1 segment
> > > > > > > >
> > > > > > > > Freeing m0 will not restore nb_seg=1 in the second
> segment.
> > > > > > > >
> > > > > > > >> Assumes that it is ok to have an mbuf with
> > > > > > > >> nb_seg > 1 and next == NULL.
> > > > > > > >> Which seems wrong to me.
> > > > > > > >
> > > > > > > > I don't think it is wrong: nb_seg is just ignored when
> not in the first
> > > > > > > > segment, and there is nothing saying it should be set to
> 1. Typically,
> > > > > > > > rte_pktmbuf_chain() does not change it, and I guess it's
> the same for
> > > > > > > > many similar functions in applications.
> > > > > > > >
> > > > > > > > Olivier
> > > > > > > >
> > > > > > > >>
> > > > > > > >>
> > > > > > > >>> This is expected that mbufs stored in pool have their
> > > > > > > >>> nb_seg field set to 1.
> > > > > > > >>>
> > > > > > > >>> Fixes: 8f094a9ac5d7 ("mbuf: set mbuf fields while in
> pool")
> > > > > > > >>> Cc: sta...@dpdk.org
> > > > > > > >>>
> > > > > > > >>> Signed-off-by: Olivier Matz <olivier.m...@6wind.com>
> > > > > > > >>> ---
> > > > > > > >>>  lib/librte_mbuf/rte_mbuf.c |  6 ++----
> > > > > > > >>>  lib/librte_mbuf/rte_mbuf.h | 12 ++++--------
> > > > > > > >>>  2 files changed, 6 insertions(+), 12 deletions(-)
> > > > > > > >>>
> > > > > > > >>> diff --git a/lib/librte_mbuf/rte_mbuf.c
> b/lib/librte_mbuf/rte_mbuf.c
> > > > > > > >>> index 8a456e5e64..e632071c23 100644
> > > > > > > >>> --- a/lib/librte_mbuf/rte_mbuf.c
> > > > > > > >>> +++ b/lib/librte_mbuf/rte_mbuf.c
> > > > > > > >>> @@ -129,10 +129,8 @@
> rte_pktmbuf_free_pinned_extmem(void *addr, void *opaque)
> > > > > > > >>>
> > > > > > > >>>       rte_mbuf_ext_refcnt_set(m->shinfo, 1);
> > > > > > > >>>       m->ol_flags = EXT_ATTACHED_MBUF;
> > > > > > > >>> -     if (m->next != NULL) {
> > > > > > > >>> -             m->next = NULL;
> > > > > > > >>> -             m->nb_segs = 1;
> > > > > > > >>> -     }
> > > > > > > >>> +     m->next = NULL;
> > > > > > > >>> +     m->nb_segs = 1;
> > > > > > > >>>       rte_mbuf_raw_free(m);
> > > > > > > >>>  }
> > > > > > > >>>
> > > > > > > >>> diff --git a/lib/librte_mbuf/rte_mbuf.h
> b/lib/librte_mbuf/rte_mbuf.h
> > > > > > > >>> index a1414ed7cd..ef5800c8ef 100644
> > > > > > > >>> --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > > > >>> +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > > > >>> @@ -1329,10 +1329,8 @@ rte_pktmbuf_prefree_seg(struct
> rte_mbuf *m)
> > > > > > > >>>                               return NULL;
> > > > > > > >>>               }
> > > > > > > >>>
> > > > > > > >>> -             if (m->next != NULL) {
> > > > > > > >>> -                     m->next = NULL;
> > > > > > > >>> -                     m->nb_segs = 1;
> > > > > > > >>> -             }
> > > > > > > >>> +             m->next = NULL;
> > > > > > > >>> +             m->nb_segs = 1;
> > > > > > > >>>
> > > > > > > >>>               return m;
> > > > > > > >>>
> > > > > > > >>> @@ -1346,10 +1344,8 @@ rte_pktmbuf_prefree_seg(struct
> rte_mbuf *m)
> > > > > > > >>>                               return NULL;
> > > > > > > >>>               }
> > > > > > > >>>
> > > > > > > >>> -             if (m->next != NULL) {
> > > > > > > >>> -                     m->next = NULL;
> > > > > > > >>> -                     m->nb_segs = 1;
> > > > > > > >>> -             }
> > > > > > > >>> +             m->next = NULL;
> > > > > > > >>> +             m->nb_segs = 1;
> > > > > > > >>>               rte_mbuf_refcnt_set(m, 1);
> > > > > > > >>>
> > > > > > > >>>               return m;
> > > > > > > >>> --
> > > > > > > >>> 2.25.1
> > > > > > > >>
> > > > > > >

Reply via email to