Hi Olivier,

> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Monday, April 06, 2015 10:50 PM
> To: Ananyev, Konstantin; dev at dpdk.org
> Cc: zoltan.kiss at linaro.org; Richardson, Bruce
> Subject: Re: [PATCH v3 1/5] mbuf: fix clone support when application uses
> private mbuf data
>
> Hi Konstantin,
>
> Thanks for your comments.
>
> On 04/02/2015 07:21 PM, Ananyev, Konstantin wrote:
> > Hi Olivier,
> >
> >> -----Original Message-----
> >> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> >> Sent: Tuesday, March 31, 2015 8:23 PM
> >> To: dev at dpdk.org
> >> Cc: Ananyev, Konstantin; zoltan.kiss at linaro.org; Richardson, Bruce;
> >> Olivier Matz
> >> Subject: [PATCH v3 1/5] mbuf: fix clone support when application uses
> >> private mbuf data
> >>
> >> From: Olivier Matz <olivier.matz at 6wind.com>
> >>
> >> Add a new priv_size field in mbuf structure that should
> >> be initialized at mbuf pool creation. This field contains the
> >> size of the application private data in mbufs.
> >>
> >> Introduce new static inline functions rte_mbuf_from_indirect()
> >> and rte_mbuf_to_baddr() to replace the existing macros, which
> >> take the private size into account when attaching and detaching
> >> mbufs.
> >>
> >> Signed-off-by: Olivier Matz <olivier.matz at 6wind.com>
> >> ---
> >>  app/test-pmd/testpmd.c     |  1 +
> >>  examples/vhost/main.c      |  4 +--
> >>  lib/librte_mbuf/rte_mbuf.c |  1 +
> >>  lib/librte_mbuf/rte_mbuf.h | 77
> >> +++++++++++++++++++++++++++++++++++----------
> >>  4 files changed, 63 insertions(+), 20 deletions(-)
> >>
> >> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> >> index 3057791..c5a195a 100644
> >> --- a/app/test-pmd/testpmd.c
> >> +++ b/app/test-pmd/testpmd.c
> >> @@ -425,6 +425,7 @@ testpmd_mbuf_ctor(struct rte_mempool *mp,
> >>      mb->tx_offload = 0;
> >>      mb->vlan_tci = 0;
> >>      mb->hash.rss = 0;
> >> +    mb->priv_size = 0;
> >>  }
> >>
> >>  static void
> >> diff --git a/examples/vhost/main.c b/examples/vhost/main.c
> >> index c3fcb80..e44e82f 100644
> >> --- a/examples/vhost/main.c
> >> +++ b/examples/vhost/main.c
> >> @@ -139,7 +139,7 @@
> >>  /* Number of descriptors per cacheline. */
> >>  #define DESC_PER_CACHELINE (RTE_CACHE_LINE_SIZE / sizeof(struct
> >> vring_desc))
> >>
> >> -#define MBUF_EXT_MEM(mb) (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != (mb))
> >> +#define MBUF_EXT_MEM(mb) (rte_mbuf_from_indirect(mb) != (mb))
> >>
> >>  /* mask of enabled ports */
> >>  static uint32_t enabled_port_mask = 0;
> >> @@ -1550,7 +1550,7 @@ attach_rxmbuf_zcp(struct virtio_net *dev)
> >>  static inline void pktmbuf_detach_zcp(struct rte_mbuf *m)
> >>  {
> >>      const struct rte_mempool *mp = m->pool;
> >> -    void *buf = RTE_MBUF_TO_BADDR(m);
> >> +    void *buf = rte_mbuf_to_baddr(m);
> >>      uint32_t buf_ofs;
> >>      uint32_t buf_len = mp->elt_size - sizeof(*m);
> >>      m->buf_physaddr = rte_mempool_virt2phy(mp, m) + sizeof(*m);
> >> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> >> index 526b18d..e095999 100644
> >> --- a/lib/librte_mbuf/rte_mbuf.c
> >> +++ b/lib/librte_mbuf/rte_mbuf.c
> >> @@ -125,6 +125,7 @@ rte_pktmbuf_init(struct rte_mempool *mp,
> >>      m->pool = mp;
> >>      m->nb_segs = 1;
> >>      m->port = 0xff;
> >> +    m->priv_size = 0;
> >
> > Why is it 0?
> > Shouldn't it be the same calculations as in detach() below:
> > m->priv_size = /*get private size from mempool private*/;
> > m->buf_addr = (char *)m + sizeof(struct rte_mbuf) + m->priv_size;
> > m->buf_len = mp->elt_size - sizeof(struct rte_mbuf) - m->priv_size;
> > ?
>
> It's 0 because we also have in the function (not visible in the
> patch):
>
> m->buf_addr = (char *)m + sizeof(struct rte_mbuf);

Yep, that's why, as I wrote above, I think we need to set up all 3 fields here:
priv_size, buf_addr, buf_len, in exactly the same way as in detach().

>
> It means that an application that wants to use a private area has
> to provide another init function derived from this default function.

After your changes, attach/free and other functions from the public mbuf API
rely on priv_size being set properly.
So I suppose the 'official' pktmbuf_init() should also set it in a proper manner.

> This was already the case before the patch series.

Before this patch series, we didn't have priv_size, so we had nothing to set up.

>
> As we discussed in previous mail, I plan to propose a rework of
> mbuf pool initialization in another series, and my initial idea was to
> change this at the same time. But on the other hand it does not hurt
> to do this change now. I'll include it in next version.

Ok.

> >
> > BTW, I don't see changes in rte_pktmbuf_pool_init() to set up
> > mbp_priv->mbuf_data_room_size properly.
> > Without those changes, how can people start using that feature?
> > It seems that the only way now is to set up priv_size and buf_len for each mbuf
> > manually.
>
> It's the same reason as above. To use a private area, the user has
> to provide their own function that sets up data_room_size, derived from
> this pool_init default function. This was also the case before the
> patch series.
>
> >
> >>  }
> >>
> >>  /* do some sanity checks on a mbuf: panic if it fails */
> >> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> >> index 17ba791..932fe58 100644
> >> --- a/lib/librte_mbuf/rte_mbuf.h
> >> +++ b/lib/librte_mbuf/rte_mbuf.h
> >> @@ -317,18 +317,51 @@ struct rte_mbuf {
> >>              /* uint64_t unused:8; */
> >>          };
> >>      };
> >> +
> >> +    /** Size of the application private data. In case of an indirect
> >> +     * mbuf, it stores the direct mbuf private data size. */
> >> +    uint16_t priv_size;
> >>  } __rte_cache_aligned;
> >>
> >>  /**
> >> - * Given the buf_addr returns the pointer to corresponding mbuf.
> >> + * Return the mbuf owning the data buffer address of an indirect mbuf.
> >> + *
> >> + * @param mi
> >> + *   The pointer to the indirect mbuf.
> >> + * @return
> >> + *   The address of the direct mbuf corresponding to buffer_addr.
> >>   */
> >> -#define RTE_MBUF_FROM_BADDR(ba) (((struct rte_mbuf *)(ba)) - 1)
> >> +static inline struct rte_mbuf *
> >> +rte_mbuf_from_indirect(struct rte_mbuf *mi)
> >> +{
> >> +    struct rte_mbuf *md;
> >> +
> >> +    /* mi->buf_addr and mi->priv_size correspond to buffer and
> >> +     * private size of the direct mbuf */
> >> +    md = (struct rte_mbuf *)((char *)mi->buf_addr - sizeof(*mi) -
> >> +            mi->priv_size);
> >
> > (uintptr_t)mi->buf_addr?
>
> Any clue why (uintptr_t) would be better than (char *) ?

No big difference really, it just looks a bit better to me :)

> By the way, I added this cast because it would not compile with
> g++ (and probably with icc too).
>
> >
> >> +    return md;
> >> +}
> >>
> >>  /**
> >> - * Given the pointer to mbuf returns an address where it's buf_addr
> >> - * should point to.
> >> + * Return the buffer address embedded in the given mbuf.
> >> + *
> >> + * The user must ensure that m->priv_size corresponds to the
> >> + * private size of this mbuf, which is not the case for indirect
> >> + * mbufs.
> >> + *
> >> + * @param md
> >> + *   The pointer to the mbuf.
> >> + * @return
> >> + *   The address of the data buffer owned by the mbuf.
> >>   */
> >> -#define RTE_MBUF_TO_BADDR(mb) (((struct rte_mbuf *)(mb)) + 1)
> >> +static inline char *
> >
> > Might be better to return 'void *' here.
>
> Ok, as m->buf_addr is a (void *).
>
> >
> >> +rte_mbuf_to_baddr(struct rte_mbuf *md)
> >> +{
> >> +    char *buffer_addr;
> >
> > uintptr_t buffer_addr?
>
> Same question as above, I don't really see why it's better than
> (char *).
>
> >
> >> +    buffer_addr = (char *)md + sizeof(*md) + md->priv_size;
> >> +    return buffer_addr;
> >> +}
> >>
> >>  /**
> >>   * Returns TRUE if given mbuf is indirect, or FALSE otherwise.
> >> @@ -688,6 +721,7 @@ static inline struct rte_mbuf
> >> *rte_pktmbuf_alloc(struct rte_mempool *mp)
> >>
> >>  /**
> >>   * Attach packet mbuf to another packet mbuf.
> >> + *
> >>   * After attachment we refer the mbuf we attached as 'indirect',
> >>   * while mbuf we attached to as 'direct'.
> >>   * Right now, not supported:
> >> @@ -701,7 +735,6 @@ static inline struct rte_mbuf
> >> *rte_pktmbuf_alloc(struct rte_mempool *mp)
> >>   * @param md
> >>   *   The direct packet mbuf.
> >>   */
> >> -
> >>  static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct
> >> rte_mbuf *md)
> >>  {
> >>      RTE_MBUF_ASSERT(RTE_MBUF_DIRECT(md) &&
> >> @@ -712,6 +745,7 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf
> >> *mi, struct rte_mbuf *md)
> >>      mi->buf_physaddr = md->buf_physaddr;
> >>      mi->buf_addr = md->buf_addr;
> >>      mi->buf_len = md->buf_len;
> >> +    mi->priv_size = md->priv_size;
> >>
> >>      mi->next = md->next;
> >>      mi->data_off = md->data_off;
> >> @@ -732,7 +766,8 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf
> >> *mi, struct rte_mbuf *md)
> >>  }
> >>
> >>  /**
> >> - * Detach an indirect packet mbuf -
> >> + * Detach an indirect packet mbuf.
> >> + *
> >>   * - restore original mbuf address and length values.
> >>   * - reset pktmbuf data and data_len to their default values.
> >>   * All other fields of the given packet mbuf will be left intact.
> >> @@ -740,22 +775,28 @@ static inline void rte_pktmbuf_attach(struct
> >> rte_mbuf *mi, struct rte_mbuf *md)
> >>   * @param m
> >>   *   The indirect attached packet mbuf.
> >>   */
> >> -
> >>  static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
> >>  {
> >> -    const struct rte_mempool *mp = m->pool;
> >> -    void *buf = RTE_MBUF_TO_BADDR(m);
> >> -    uint32_t buf_len = mp->elt_size - sizeof(*m);
> >> -    m->buf_physaddr = rte_mempool_virt2phy(mp, m) + sizeof (*m);
> >> -
> >> +    struct rte_pktmbuf_pool_private *mbp_priv;
> >> +    struct rte_mempool *mp = m->pool;
> >> +    void *buf;
> >> +    unsigned mhdr_size;
> >> +
> >> +    /* first, restore the priv_size, this is needed before calling
> >> +     * rte_mbuf_to_baddr() */
> >> +    mbp_priv = rte_mempool_get_priv(mp);
> >> +    m->priv_size = mp->elt_size - RTE_PKTMBUF_HEADROOM -
> >> +            mbp_priv->mbuf_data_room_size -
> >> +            sizeof(struct rte_mbuf);
> >
> > I think it is better to put this priv_size calculation above into a
> > separate function -
> > rte_mbuf_get_priv_size(m) or something.
> > We need it in a few places, and users would probably need it anyway.
>
> yep, good idea
>
> >
> >> +
> >> +    buf = rte_mbuf_to_baddr(m);
> >> +    mhdr_size = (char *)buf - (char *)m;
> >
> > Why do you need to recalculate mhdr_size here?
> > As I understand, it is m->priv_size, and you just retrieved it, 2 lines
> > above.
> >
>
> It's not m->priv_size but (sizeof(rte_mbuf) + m->priv_size).

Ah yes, sorry for the confusion.

> In both cases, it requires an operation, but maybe
> mhdr_size = (sizeof(rte_mbuf) + m->priv_size)
> is clearer than
> mhdr_size = (char *)buf - (char *)m
>
> >> +    m->buf_physaddr = rte_mempool_virt2phy(mp, m) + mhdr_size;
> >
> > Actually I think it could just be:
> > m->buf_physaddr = rte_mempool_virt2phy(mp, buf);
>
> Even if it would work, the API of rte_mempool_virt2phy()
> says that the second argument should be "A pointer (virtual address)
> to the element of the pool."
> I think we should keep the initial code.

Ok.
Konstantin

>
> Regards,
> Olivier
>
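
For reference, the layout arithmetic discussed in this thread can be sketched
as a small standalone model. This is not DPDK code: struct toy_mbuf and all
toy_* helpers are hypothetical names used only for illustration, and the sketch
assumes each pool element is laid out as header, then private area, then data
buffer. It sets priv_size, buf_addr and buf_len together at init time, which is
the behaviour proposed above for pktmbuf_init().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct rte_mbuf (hypothetical, not the DPDK layout):
 * only the three fields this thread is about. */
struct toy_mbuf {
    void *buf_addr;     /* start of the data buffer */
    uint16_t buf_len;   /* length of the data buffer */
    uint16_t priv_size; /* size of the application private area */
};

/* Header plus private area, i.e. the data buffer offset in an element. */
static inline size_t toy_mhdr_size(const struct toy_mbuf *m)
{
    return sizeof(*m) + m->priv_size;
}

/* Same arithmetic as rte_mbuf_to_baddr() in the patch, returning void *.
 * Only valid when md->priv_size describes md itself (a direct mbuf). */
static inline void *toy_mbuf_to_baddr(struct toy_mbuf *md)
{
    return (char *)md + toy_mhdr_size(md);
}

/* Same arithmetic as rte_mbuf_from_indirect() in the patch: buf_addr and
 * priv_size of an attached mbuf describe the direct mbuf. */
static inline struct toy_mbuf *toy_mbuf_from_indirect(struct toy_mbuf *mi)
{
    return (struct toy_mbuf *)((char *)mi->buf_addr - sizeof(*mi) -
                               mi->priv_size);
}

/* Set all three fields consistently from the element and private sizes. */
static inline void toy_mbuf_init(struct toy_mbuf *m, size_t elt_size,
                                 uint16_t priv_size)
{
    assert(elt_size >= sizeof(*m) + priv_size);
    m->priv_size = priv_size;
    m->buf_addr = toy_mbuf_to_baddr(m);
    m->buf_len = (uint16_t)(elt_size - toy_mhdr_size(m));
}

/* Attach: the indirect mbuf borrows the buffer fields, including
 * priv_size, from the direct mbuf. */
static inline void toy_mbuf_attach(struct toy_mbuf *mi,
                                   const struct toy_mbuf *md)
{
    mi->buf_addr = md->buf_addr;
    mi->buf_len = md->buf_len;
    mi->priv_size = md->priv_size;
}
```

With priv_size copied on attach, toy_mbuf_from_indirect() finds the direct mbuf
even when the private area is non-zero, and it returns the mbuf itself when
called on a direct mbuf, which is the property the MBUF_EXT_MEM() check in
examples/vhost relies on.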