> >
> > >
> > > On Thu, Nov 05, 2020 at 11:34:18AM +0000, Ananyev, Konstantin wrote:
> > > >
> > > >
> > > > > On Thu, Nov 05, 2020 at 11:26:51AM +0300, Andrew Rybchenko wrote:
> > > > > > On 11/5/20 10:46 AM, Olivier Matz wrote:
> > > > > > > On Thu, Nov 05, 2020 at 12:15:49AM +0000, Ananyev, Konstantin
> > > > > > > wrote:
> > > > > > >>
> > > > > > >> Hi Olivier,
> > > > > > >>
> > > > > > >>> m->nb_seg must be reset on mbuf free whatever the value of
> > > > > > >>> m->next,
> > > > > > >>> because it can happen that m->nb_seg is != 1. For instance in
> > > > > > >>> this
> > > > > > >>> case:
> > > > > > >>>
> > > > > > >>> m1 = rte_pktmbuf_alloc(mp);
> > > > > > >>> rte_pktmbuf_append(m1, 500);
> > > > > > >>> m2 = rte_pktmbuf_alloc(mp);
> > > > > > >>> rte_pktmbuf_append(m2, 500);
> > > > > > >>> rte_pktmbuf_chain(m1, m2);
> > > > > > >>> m0 = rte_pktmbuf_alloc(mp);
> > > > > > >>> rte_pktmbuf_append(m0, 500);
> > > > > > >>> rte_pktmbuf_chain(m0, m1);
> > > > > > >>>
> > > > > > >>> As rte_pktmbuf_chain() does not reset nb_seg in the initial m1
> > > > > > >>> segment (this is not required), after this code the mbuf chain
> > > > > > >>> have 3 segments:
> > > > > > >>> - m0: next=m1, nb_seg=3
> > > > > > >>> - m1: next=m2, nb_seg=2
> > > > > > >>> - m2: next=NULL, nb_seg=1
> > > > > > >>>
> > > > > > >>> Freeing this mbuf chain will not restore nb_seg=1 in the second
> > > > > > >>> segment.
> > > > > > >>
> > > > > > >> Hmm, not sure why is that?
> > > > > > >> You are talking about freeing m1, right?
> > > > > > >> rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > > > >> {
> > > > > > >> ...
> > > > > > >> if (m->next != NULL) {
> > > > > > >> m->next = NULL;
> > > > > > >> m->nb_segs = 1;
> > > > > > >> }
> > > > > > >>
> > > > > > >> m1->next != NULL, so it will enter the if() block,
> > > > > > >> and will reset both next and nb_segs.
> > > > > > >> What I am missing here?
> > > > > > >> Thinking in more generic way, that change:
> > > > > > >> - if (m->next != NULL) {
> > > > > > >> - m->next = NULL;
> > > > > > >> - m->nb_segs = 1;
> > > > > > >> - }
> > > > > > >> + m->next = NULL;
> > > > > > >> + m->nb_segs = 1;
> > > > > > >
> > > > > > > Ah, sorry. I oversimplified the example and now it does not
> > > > > > > show the issue...
> > > > > > >
> > > > > > > The full example also adds a split() to break the mbuf chain
> > > > > > > between m1 and m2. The kind of thing that would be done for
> > > > > > > software TCP segmentation.
> > > > > > >
> > > > > >
> > > > > > If so, may be the right solution is to care about nb_segs
> > > > > > when next is set to NULL on split? Any place when next is set
> > > > > > to NULL. Just to keep the optimization in a more generic place.
> > > >
> > > >
> > > > > The problem with that approach is that there are already several
> > > > > existing split() or trim() implementations in different DPDK-based
> > > > > applications. For instance, we have some in 6WINDGate. If we force
> > > > > applications to set nb_seg to 1 when resetting next, it has to be
> > > > > documented because it is not straightforward.
> > > >
> > > > I think it is better to go that way.
> > > > From my perspective it seems natural to reset nb_seg at same time
> > > > we reset next, otherwise inconsistency will occur.
> > >
> > > While it is not explicitly stated for nb_segs, to me it was clear that
> > > nb_segs is only valid in the first segment, like for many fields (port,
> > > ol_flags, vlan, rss, ...).
> > >
> > > If we say that nb_segs has to be valid in any segments, it means that
> > > chain() or split() will have to update it in all segments, which is not
> > > efficient.
> >
> > Why in all?
> > We can state that nb_segs on non-first segment should always equal 1.
> > As I understand in that case, both split() and chain() have to update
> > nb_segs
> > only for head mbufs, rest ones will remain untouched.
>
> Well, anyway, I think it's strange to have a constraint on m->nb_segs for
> non-first segment. We don't have that kind of constraints for other fields.
True, we don't. But this is one of the fields we consider critical
for proper work of mbuf alloc/free mechanism.
>
>
> >
> > >
> > > Saying that nb_segs has to be valid for the first and last segment seems
> > > really odd to me. What is the logic behind that except keeping this test
> > > as is?
> > >
> > > In any case, it has to be better documented.
> > >
> > >
> > > Olivier
> > >
> > >
> > > > > I think the approach from
> > > > > this patch is safer.
> > > >
> > > > It might be easier from perspective that changes in less places are
> > > > required,
> > > > Though I think that this patch will introduce some performance drop.
> > > > As now each mbuf_prefree_seg() will cause update of 2 cache lines
> > > > unconditionally.
> > > >
> > > > > By the way, for 21.11, if we are able to do some optimizations and
> > > > > have
> > > > > both pool (index?) and next in the first cache line, we may reconsider
> > > > > the fact that next and nb_segs are already set for new allocated
> > > > > mbufs,
> > > > > because it is not straightforward either.
> > > >
> > > > My suggestion - let's put future optimization discussion aside for now,
> > > > and concentrate on that particular patch.
> > > >
> > > > >
> > > > > > > After this operation, we have 2 mbuf chain:
> > > > > > > - m0 with 2 segments, the last one has next=NULL but nb_seg=2
> > > > > > > - new_m with 1 segment
> > > > > > >
> > > > > > > Freeing m0 will not restore nb_seg=1 in the second segment.
> > > > > > >
> > > > > > >> Assumes that it is ok to have an mbuf with
> > > > > > >> nb_seg > 1 and next == NULL.
> > > > > > >> Which seems wrong to me.
> > > > > > >
> > > > > > > I don't think it is wrong: nb_seg is just ignored when not in the
> > > > > > > first
> > > > > > > segment, and there is nothing saying it should be set to 1.
> > > > > > > Typically,
> > > > > > > rte_pktmbuf_chain() does not change it, and I guess it's the same
> > > > > > > for
> > > > > > > many similar functions in applications.
> > > > > > >
> > > > > > > Olivier
> > > > > > >
> > > > > > >>
> > > > > > >>
> > > > > > >>> This is expected that mbufs stored in pool have their
> > > > > > >>> nb_seg field set to 1.
> > > > > > >>>
> > > > > > >>> Fixes: 8f094a9ac5d7 ("mbuf: set mbuf fields while in pool")
> > > > > > >>> Cc: sta...@dpdk.org
> > > > > > >>>
> > > > > > >>> Signed-off-by: Olivier Matz <olivier.m...@6wind.com>
> > > > > > >>> ---
> > > > > > >>> lib/librte_mbuf/rte_mbuf.c | 6 ++----
> > > > > > >>> lib/librte_mbuf/rte_mbuf.h | 12 ++++--------
> > > > > > >>> 2 files changed, 6 insertions(+), 12 deletions(-)
> > > > > > >>>
> > > > > > >>> diff --git a/lib/librte_mbuf/rte_mbuf.c
> > > > > > >>> b/lib/librte_mbuf/rte_mbuf.c
> > > > > > >>> index 8a456e5e64..e632071c23 100644
> > > > > > >>> --- a/lib/librte_mbuf/rte_mbuf.c
> > > > > > >>> +++ b/lib/librte_mbuf/rte_mbuf.c
> > > > > > >>> @@ -129,10 +129,8 @@ rte_pktmbuf_free_pinned_extmem(void *addr,
> > > > > > >>> void *opaque)
> > > > > > >>>
> > > > > > >>> rte_mbuf_ext_refcnt_set(m->shinfo, 1);
> > > > > > >>> m->ol_flags = EXT_ATTACHED_MBUF;
> > > > > > >>> - if (m->next != NULL) {
> > > > > > >>> - m->next = NULL;
> > > > > > >>> - m->nb_segs = 1;
> > > > > > >>> - }
> > > > > > >>> + m->next = NULL;
> > > > > > >>> + m->nb_segs = 1;
> > > > > > >>> rte_mbuf_raw_free(m);
> > > > > > >>> }
> > > > > > >>>
> > > > > > >>> diff --git a/lib/librte_mbuf/rte_mbuf.h
> > > > > > >>> b/lib/librte_mbuf/rte_mbuf.h
> > > > > > >>> index a1414ed7cd..ef5800c8ef 100644
> > > > > > >>> --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > > >>> +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > > >>> @@ -1329,10 +1329,8 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf
> > > > > > >>> *m)
> > > > > > >>> return NULL;
> > > > > > >>> }
> > > > > > >>>
> > > > > > >>> - if (m->next != NULL) {
> > > > > > >>> - m->next = NULL;
> > > > > > >>> - m->nb_segs = 1;
> > > > > > >>> - }
> > > > > > >>> + m->next = NULL;
> > > > > > >>> + m->nb_segs = 1;
> > > > > > >>>
> > > > > > >>> return m;
> > > > > > >>>
> > > > > > >>> @@ -1346,10 +1344,8 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf
> > > > > > >>> *m)
> > > > > > >>> return NULL;
> > > > > > >>> }
> > > > > > >>>
> > > > > > >>> - if (m->next != NULL) {
> > > > > > >>> - m->next = NULL;
> > > > > > >>> - m->nb_segs = 1;
> > > > > > >>> - }
> > > > > > >>> + m->next = NULL;
> > > > > > >>> + m->nb_segs = 1;
> > > > > > >>> rte_mbuf_refcnt_set(m, 1);
> > > > > > >>>
> > > > > > >>> return m;
> > > > > > >>> --
> > > > > > >>> 2.25.1
> > > > > > >>
> > > > > >