On Tue, Oct 29, 2019 at 01:59:00PM +0300, Andrew Rybchenko wrote: > On 10/28/19 5:01 PM, Olivier Matz wrote: > > When populating a mempool, ensure that objects are not located across > > several pages, except if user did not request iova contiguous objects. > > I think it breaks distribution across memory channels which could > affect performance significantly.
With 2M hugepages, there are ~900 mbufs per page, and all of them will be distributed across memory channels. For larger objects, I don't think the distribution is that important. With small pages, that may be true. I think the problem was already there except in IOVA=VA mode. This should be fixable, but I'm not sure a use-case where we can see a regression really exists. > > Signed-off-by: Vamsi Krishna Attunuru <vattun...@marvell.com> > > Signed-off-by: Olivier Matz <olivier.m...@6wind.com> > > --- > > lib/librte_mempool/rte_mempool.c | 23 +++++----------- > > lib/librte_mempool/rte_mempool_ops_default.c | 29 ++++++++++++++++++-- > > 2 files changed, 33 insertions(+), 19 deletions(-) > > > > diff --git a/lib/librte_mempool/rte_mempool.c > > b/lib/librte_mempool/rte_mempool.c > > index 7664764e5..b23fd1b06 100644 > > --- a/lib/librte_mempool/rte_mempool.c > > +++ b/lib/librte_mempool/rte_mempool.c > > @@ -428,8 +428,6 @@ rte_mempool_get_page_size(struct rte_mempool *mp, > > size_t *pg_sz) > > if (!need_iova_contig_obj) > > *pg_sz = 0; > > - else if (!alloc_in_ext_mem && rte_eal_iova_mode() == RTE_IOVA_VA) > > - *pg_sz = 0; > > else if (rte_eal_has_hugepages() || alloc_in_ext_mem) > > *pg_sz = get_min_page_size(mp->socket_id); > > else > > @@ -478,17 +476,15 @@ rte_mempool_populate_default(struct rte_mempool *mp) > > * then just set page shift and page size to 0, because the user has > > * indicated that there's no need to care about anything. > > * > > - * if we do need contiguous objects, there is also an option to reserve > > - * the entire mempool memory as one contiguous block of memory, in > > - * which case the page shift and alignment wouldn't matter as well. > > + * if we do need contiguous objects (if a mempool driver has its > > + * own calc_size() method returning min_chunk_size = mem_size), > > + * there is also an option to reserve the entire mempool memory > > + * as one contiguous block of memory. > > * > > * if we require contiguous objects, but not necessarily the entire > > - * mempool reserved space to be contiguous, then there are two options. > > - * > > - * if our IO addresses are virtual, not actual physical (IOVA as VA > > - * case), then no page shift needed - our memory allocation will give us > > - * contiguous IO memory as far as the hardware is concerned, so > > - * act as if we're getting contiguous memory. > > + * mempool reserved space to be contiguous, pg_sz will be != 0, > > + * and the default ops->populate() will take care of not placing > > + * objects across pages. > > * > > * if our IO addresses are physical, we may get memory from bigger > > * pages, or we might get memory from smaller pages, and how much of it > > @@ -501,11 +497,6 @@ rte_mempool_populate_default(struct rte_mempool *mp) > > * > > * If we fail to get enough contiguous memory, then we'll go and > > * reserve space in smaller chunks. > > - * > > - * We also have to take into account the fact that memory that we're > > - * going to allocate from can belong to an externally allocated memory > > - * area, in which case the assumption of IOVA as VA mode being > > - * synonymous with IOVA contiguousness will not hold. > > */ > > need_iova_contig_obj = !(mp->flags & MEMPOOL_F_NO_IOVA_CONTIG); > > diff --git a/lib/librte_mempool/rte_mempool_ops_default.c > > b/lib/librte_mempool/rte_mempool_ops_default.c > > index f6aea7662..dd09a0a32 100644 > > --- a/lib/librte_mempool/rte_mempool_ops_default.c > > +++ b/lib/librte_mempool/rte_mempool_ops_default.c > > @@ -61,21 +61,44 @@ rte_mempool_op_calc_mem_size_default(const struct > > rte_mempool *mp, > > return mem_size; > > } > > +/* Returns -1 if object crosses a page boundary, else returns 0 */ > > +static int > > +check_obj_bounds(char *obj, size_t pg_sz, size_t elt_sz) > > +{ > > + if (pg_sz == 0) > > + return 0; > > + if (elt_sz > pg_sz) > > + return 0; > > + if (RTE_PTR_ALIGN(obj, pg_sz) != RTE_PTR_ALIGN(obj + elt_sz - 1, pg_sz)) > > + return -1; > > + return 0; > > +} > > + > > int > > rte_mempool_op_populate_default(struct rte_mempool *mp, unsigned int > > max_objs, > > void *vaddr, rte_iova_t iova, size_t len, > > rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg) > > { > > - size_t total_elt_sz; > > + char *va = vaddr; > > + size_t total_elt_sz, pg_sz; > > size_t off; > > unsigned int i; > > void *obj; > > + rte_mempool_get_page_size(mp, &pg_sz); > > + > > The function may return an error which should be taken into account here. That would be better, indeed. > > total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size; > > - for (off = 0, i = 0; off + total_elt_sz <= len && i < max_objs; i++) { > > + for (off = 0, i = 0; i < max_objs; i++) { > > + /* align offset to next page start if required */ > > + if (check_obj_bounds(va + off, pg_sz, total_elt_sz) < 0) > > + off += RTE_PTR_ALIGN_CEIL(va + off, pg_sz) - (va + off); > > + > > + if (off + total_elt_sz > len) > > + break; > > + > > off += mp->header_size; > > - obj = (char *)vaddr + off; > > + obj = va + off; > > obj_cb(mp, obj_cb_arg, obj, > > (iova == RTE_BAD_IOVA) ? RTE_BAD_IOVA : (iova + off)); > > rte_mempool_ops_enqueue_bulk(mp, &obj, 1); > Thanks for all the comments! I will send a v2 tomorrow. Olivier