> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Tuesday, October 25, 2016 2:48 PM
> To: Morten Brørup <mb at smartsharesystems.com>
> Cc: Richardson, Bruce <bruce.richardson at intel.com>; Wiles, Keith
> <keith.wiles at intel.com>; dev at dpdk.org; Olivier Matz
> <olivier.matz at 6wind.com>; Oleg Kuporosov <olegk at mellanox.com>
> Subject: Re: [dpdk-dev] mbuf changes
>
> On Tue, Oct 25, 2016 at 02:16:29PM +0200, Morten Brørup wrote:
> > Comments inline.
>
> I'm only replying to the nb_segs bits here.
>
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > > Sent: Tuesday, October 25, 2016 1:14 PM
> > > To: Adrien Mazarguil
> > > Cc: Morten Brørup; Wiles, Keith; dev at dpdk.org; Olivier Matz; Oleg Kuporosov
> > > Subject: Re: [dpdk-dev] mbuf changes
> > >
> > > On Tue, Oct 25, 2016 at 01:04:44PM +0200, Adrien Mazarguil wrote:
> > > > On Tue, Oct 25, 2016 at 12:11:04PM +0200, Morten Brørup wrote:
> > > > > Comments inline.
> > > > >
> > > > > Med venlig hilsen / kind regards
> > > > > - Morten Brørup
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> > > > > > Sent: Tuesday, October 25, 2016 11:39 AM
> > > > > > To: Bruce Richardson
> > > > > > Cc: Wiles, Keith; Morten Brørup; dev at dpdk.org; Olivier Matz; Oleg Kuporosov
> > > > > > Subject: Re: [dpdk-dev] mbuf changes
> > > > > >
> > > > > > On Mon, Oct 24, 2016 at 05:25:38PM +0100, Bruce Richardson wrote:
> > > > > > > On Mon, Oct 24, 2016 at 04:11:33PM +0000, Wiles, Keith wrote:
> > > > > > [...]
> > > > > > > > > On Oct 24, 2016, at 10:49 AM, Morten Brørup <mb at smartsharesystems.com> wrote:
> > > > > > [...]
> > > > > > > > > 5.
> > > > > > > > >
> > > > > > > > > And here's something new to think about:
> > > > > > > > >
> > > > > > > > > m->next already reveals if there are more segments to a packet.
> > > > > > > > > Which purpose does m->nb_segs serve that is not already covered
> > > > > > > > > by m->next?
> > > > > > >
> > > > > > > It is duplicate info, but nb_segs can be used to check the validity of
> > > > > > > the next pointer without having to read the second mbuf cacheline.
> > > > > > >
> > > > > > > Whether it's worth having is something I'm happy enough to
> > > > > > > discuss, though.
> > > > > >
> > > > > > Although slower in some cases than a full blown "next packet"
> > > > > > pointer, nb_segs can also be conveniently abused to link several
> > > > > > packets and their segments in the same list without wasting space.
> > > > >
> > > > > I don't understand that; can you please elaborate? Are you abusing
> > > > > m->nb_segs as an index into an array in your application? If that is
> > > > > the case, and it is endorsed by the community, we should get rid of
> > > > > m->nb_segs and add a member for application specific use instead.
> > > >
> > > > Well, that's just an idea, I'm not aware of any application using
> > > > this, however the ability to link several packets with segments seems
> > > > useful to me (e.g. buffering packets). Here's a diagram:
> > > >
> > > >  .-----------.   .-----------.   .-----------.   .-----------.   .------
> > > >  | pkt 0     |   | seg 1     |   | seg 2     |   | pkt 1     |   | pkt 2
> > > >  |      next --->|      next --->|      next --->|      next --->| ...
> > > >  | nb_segs 3 |   | nb_segs 1 |   | nb_segs 1 |   | nb_segs 1 |   |
> > > >  `-----------'   `-----------'   `-----------'   `-----------'   `------
> >
> > I see.
> > It makes it possible to refer to a burst of packets (with segments
> > or not) by a single mbuf reference, as an alternative to the current
> > design pattern of using an array and length (struct rte_mbuf **mbufs,
> > unsigned count).
> >
> > This would require implementation in the PMDs etc.
> >
> > And even in this case, m->nb_segs does not need to be an integer, but could
> > be replaced by a single bit indicating if the segment is a
> > continuation of a packet or the beginning (alternatively the end) of a
> > packet, i.e. the bit can be set for either the first or the last segment in
> > the packet.
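To make the chaining idea from the diagram concrete, here is a minimal sketch of how one packet could be detached from such a list. It assumes a simplified, hypothetical `struct seg` (not the real `struct rte_mbuf`) and a hypothetical helper `pop_packet()`; neither exists in DPDK. It only illustrates that `nb_segs` of the first segment tells where a packet ends, and that the last segment's `next` has to be reset to NULL before the packet is handed to ordinary mbuf-handling code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rte_mbuf: only the two
 * fields discussed in this thread are modelled. */
struct seg {
    struct seg *next;   /* next segment, or first segment of the next packet */
    uint8_t nb_segs;    /* segment count, read from a packet's first segment */
};

/* Detach the first packet from a chained list and return it.
 * *list is advanced to the first segment of the following packet. */
static struct seg *pop_packet(struct seg **list)
{
    struct seg *head = *list;
    struct seg *last = head;

    if (head == NULL)
        return NULL;

    /* Walk to the last segment of this packet (nb_segs - 1 hops). */
    for (uint8_t i = 1; i < head->nb_segs && last->next != NULL; i++)
        last = last->next;

    *list = last->next;   /* remainder of the list: the next packet */
    last->next = NULL;    /* restore a normal, NULL-terminated packet */
    return head;
}
```

With the list from the diagram, the first call would return pkt 0 with its three segments and leave the list pointing at pkt 1.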
We do need nb_segs, at least for TX.
That's how the TX function calculates how many TXDs it needs to allocate (and fill).
Of course it could re-scan the whole chain of segments to count them, but I think that would slow things down even more.
Though yes, I suppose it can be moved to the second cache line.
Konstantin

>
> Sure, however if we keep the current definition, a single bit would not be
> enough as it must be nonzero for the buffer to be valid. I think an 8-bit
> field is not that expensive for a counter.
>
> > It is an almost equivalent alternative to the fundamental design pattern of
> > using an array of mbuf with count, which is widely implemented
> > in DPDK. And m->next still lives in the second cache line, so I don't see any
> > gain by this.
>
> That's right, it does not have to live in the first cache line, my only
> concern was its entire removal.
>
> > I still don't get how m->nb_segs can be abused without m->next.
>
> By "abused" I mean that applications are not supposed to pass this kind of
> mbuf lists directly to existing mbuf-handling functions (TX burst,
> rte_pktmbuf_free() and so on), however these same applications (even PMDs)
> can do so internally temporarily because it's so simple.
>
> The next pointer of the last segment of a packet must still be set to NULL
> every time a packet is retrieved from such a list to be processed.
>
> > > However, nb_segs may be a good candidate for demotion, along with
> > > possibly the port value, or the reference count.
>
> Yes, I think that's fine as long as it's kept somewhere.
>
> --
> Adrien Mazarguil
> 6WIND
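The TX argument above can be sketched as follows. This is only an illustration under stated assumptions: `struct pkt_seg` is a hypothetical, simplified stand-in for the mbuf, both function names are made up, and one TX descriptor per segment is assumed. The first function reads the count from the first segment alone; the second has to touch every segment in the chain to arrive at the same number.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rte_mbuf. */
struct pkt_seg {
    struct pkt_seg *next;   /* next segment of the same packet, or NULL */
    uint8_t nb_segs;        /* segment count, valid in the first segment */
};

/* Descriptor count taken from the field: a single read of the first
 * segment (assumes one descriptor per segment). */
static unsigned int txd_count_from_field(const struct pkt_seg *head)
{
    return head->nb_segs;
}

/* Descriptor count without nb_segs: the whole chain must be walked,
 * dereferencing the next pointer of every segment. */
static unsigned int txd_count_by_scan(const struct pkt_seg *head)
{
    unsigned int n = 0;

    for (const struct pkt_seg *s = head; s != NULL; s = s->next)
        n++;
    return n;
}
```

Both functions agree on a well-formed packet; the difference is only in how many segments have to be read to get the answer.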