On Wed, Apr 25, 2018 at 01:31:42PM +0000, Ananyev, Konstantin wrote:
[...]
> >  /** Mbuf prefetch */
> >  #define RTE_MBUF_PREFETCH_TO_FREE(m) do {       \
> >  	if ((m) != NULL)                        \
> > @@ -1213,11 +1306,127 @@ static inline int rte_pktmbuf_alloc_bulk(struct
> > rte_mempool *pool,
> >  }
> >
> >  /**
> > + * Attach an external buffer to a mbuf.
> > + *
> > + * User-managed anonymous buffer can be attached to an mbuf. When attaching
> > + * it, corresponding free callback function and its argument should be
> > + * provided. This callback function will be called once all the mbufs are
> > + * detached from the buffer.
> > + *
> > + * The headroom for the attaching mbuf will be set to zero and this can be
> > + * properly adjusted after attachment. For example, ``rte_pktmbuf_adj()``
> > + * or ``rte_pktmbuf_reset_headroom()`` can be used.
> > + *
> > + * More mbufs can be attached to the same external buffer by
> > + * ``rte_pktmbuf_attach()`` once the external buffer has been attached by
> > + * this API.
> > + *
> > + * Detachment can be done by either ``rte_pktmbuf_detach_extbuf()`` or
> > + * ``rte_pktmbuf_detach()``.
> > + *
> > + * Attaching an external buffer is quite similar to mbuf indirection in
> > + * replacing buffer addresses and length of a mbuf, but a few differences:
> > + * - When an indirect mbuf is attached, refcnt of the direct mbuf would be
> > + *   2 as long as the direct mbuf itself isn't freed after the attachment.
> > + *   In such cases, the buffer area of a direct mbuf must be read-only. But
> > + *   external buffer has its own refcnt and it starts from 1. Unless
> > + *   multiple mbufs are attached to a mbuf having an external buffer, the
> > + *   external buffer is writable.
> > + * - There's no need to allocate buffer from a mempool. Any buffer can be
> > + *   attached with appropriate free callback and its IO address.
> > + * - Smaller metadata is required to maintain shared data such as refcnt.
> > + *
> > + * @warning
> > + * @b EXPERIMENTAL: This API may change without prior notice.
> > + * Once external buffer is enabled by allowing experimental API,
> > + * ``RTE_MBUF_DIRECT()`` and ``RTE_MBUF_INDIRECT()`` are no longer
> > + * exclusive. A mbuf can be considered direct if it is neither indirect nor
> > + * having external buffer.
> > + *
> > + * @param m
> > + *   The pointer to the mbuf.
> > + * @param buf_addr
> > + *   The pointer to the external buffer we're attaching to.
> > + * @param buf_iova
> > + *   IO address of the external buffer we're attaching to.
> > + * @param buf_len
> > + *   The size of the external buffer we're attaching to. If memory for
> > + *   shared data is not provided, buf_len must be larger than the size of
> > + *   ``struct rte_mbuf_ext_shared_info`` and padding for alignment. If not
> > + *   enough, this function will return NULL.
> > + * @param shinfo
> > + *   User-provided memory for shared data. If NULL, a few bytes in the
> > + *   trailer of the provided buffer will be dedicated for shared data and
> > + *   the shared data will be properly initialized. Otherwise, user must
> > + *   initialize the content except for free callback and its argument. The
> > + *   pointer of shared data will be stored in m->shinfo.
> > + * @param free_cb
> > + *   Free callback function to call when the external buffer needs to be
> > + *   freed.
> > + * @param fcb_opaque
> > + *   Argument for the free callback function.
> > + *
> > + * @return
> > + *   A pointer to the new start of the data on success, return NULL
> > + *   otherwise.
> > + */
> > +static inline char * __rte_experimental
> > +rte_pktmbuf_attach_extbuf(struct rte_mbuf *m, void *buf_addr,
> > +	rte_iova_t buf_iova, uint16_t buf_len,
> > +	struct rte_mbuf_ext_shared_info *shinfo,
> > +	rte_mbuf_extbuf_free_callback_t free_cb, void *fcb_opaque)
> > +{
> > +	/* Additional attachment should be done by rte_pktmbuf_attach() */
> > +	RTE_ASSERT(!RTE_MBUF_HAS_EXTBUF(m));
>
> Shouldn't we have here something like:
> RTE_ASSERT(RTE_MBUF_DIRECT(m) && rte_mbuf_refcnt_read(m) == 1);
> ?

Right. That's better. The attaching mbuf should be direct and writable.

> > +
> > +	m->buf_addr = buf_addr;
> > +	m->buf_iova = buf_iova;
> > +
> > +	if (shinfo == NULL) {
>
> Instead of allocating shinfo ourselves - wouldn't it be better to rely
> on caller always allocating and filling it for us (he can do that at the
> end/start of buffer, or wherever he likes)?

It is just for convenience. For some users, external attachment could be
occasional and casual, e.g. punting control traffic from kernel/hv. For such
non-serious cases, it is good to provide this small utility.

> Again in that case - caller can provide one shinfo to several mbufs (with
> different buf_addrs)
> and would know for sure that free_cb wouldn't be overwritten by mistake.
> I.E. mbuf code will only update refcnt inside shinfo.

I think you missed the discussion with other people yesterday. This change is
exactly for that purpose. As I documented above, if this API is called with
shinfo provided, it uses the user-provided shinfo instead of sparing a few
bytes in the trailer, and it won't touch the shinfo apart from setting the
free callback and its argument. This code block runs only if the user doesn't
provide memory for shared data (shinfo is NULL).

> > +		void *buf_end = RTE_PTR_ADD(buf_addr, buf_len);
> > +
> > +		shinfo = RTE_PTR_ALIGN_FLOOR(RTE_PTR_SUB(buf_end,
> > +					sizeof(*shinfo)), sizeof(uintptr_t));
> > +		if ((void *)shinfo <= buf_addr)
> > +			return NULL;
> > +
> > +		m->buf_len = RTE_PTR_DIFF(shinfo, buf_addr);
> > +		rte_mbuf_ext_refcnt_set(shinfo, 1);
> > +	} else {
> > +		m->buf_len = buf_len;
>
> I think you need to update shinfo->refcnt here too.

As explained above, if shinfo is provided, it doesn't alter anything except
for the callback and its arg.

Thanks,
Yongseok
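
P.S. For anyone following the thread who wants to look at the trailer
carve-out in isolation, below is a rough standalone sketch of what the
shinfo == NULL branch computes, written in plain C without the RTE_PTR_*
macros. The names struct shinfo_sketch and carve_shinfo() are hypothetical
and exist only for this illustration; the struct layout is an assumption and
is not the real struct rte_mbuf_ext_shared_info. The early size check is also
an addition of the sketch, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared info; the real layout may differ. */
struct shinfo_sketch {
	void (*free_cb)(void *addr, void *opaque);
	void *fcb_opaque;
	uint16_t refcnt;
};

/*
 * Place the shared info at the tail of [buf, buf + buf_len), aligned down
 * to sizeof(uintptr_t). Returns NULL if the buffer is too small. On success,
 * *usable_len is the number of data bytes left in front of the shared info
 * (what the patch stores in m->buf_len) and refcnt starts at 1.
 */
static struct shinfo_sketch *
carve_shinfo(void *buf, uint16_t buf_len, uint16_t *usable_len)
{
	uintptr_t start = (uintptr_t)buf;
	uintptr_t p;
	struct shinfo_sketch *shinfo;

	assert(usable_len != NULL);

	/* Extra guard in this sketch: avoid stepping back past the start. */
	if (buf_len < sizeof(struct shinfo_sketch))
		return NULL;

	/* Step back from the end of the buffer, then align down. */
	p = start + buf_len - sizeof(struct shinfo_sketch);
	p &= ~(uintptr_t)(sizeof(uintptr_t) - 1);

	/* Same rejection rule as the patch: no room left for packet data. */
	if (p <= start)
		return NULL;

	shinfo = (struct shinfo_sketch *)p;
	shinfo->refcnt = 1;
	*usable_len = (uint16_t)(p - start);
	return shinfo;
}
```

The point of the sketch is only that the usable length shrinks by the size of
the shared info plus any alignment padding, which is why buf_len has to be
larger than both when no shinfo is passed in.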