On Wed, Oct 30, 2019 at 10:46:31AM +0300, Andrew Rybchenko wrote: > On 10/29/19 8:25 PM, Vamsi Krishna Attunuru wrote: > > Hi Olivier, > > > > > -----Original Message----- > > > From: Olivier Matz <olivier.m...@6wind.com> > > > Sent: Monday, October 28, 2019 7:31 PM > > > To: dev@dpdk.org > > > Cc: Anatoly Burakov <anatoly.bura...@intel.com>; Andrew Rybchenko > > > <arybche...@solarflare.com>; Ferruh Yigit <ferruh.yi...@linux.intel.com>; > > > Giridharan, Ganesan <ggiridha...@rbbn.com>; Jerin Jacob Kollanukkaran > > > <jer...@marvell.com>; Kiran Kumar Kokkilagadda > > > <kirankum...@marvell.com>; Stephen Hemminger > > > <sthem...@microsoft.com>; Thomas Monjalon <tho...@monjalon.net>; > > > Vamsi Krishna Attunuru <vattun...@marvell.com> > > > Subject: [EXT] [PATCH 5/5] mempool: prevent objects from being across > > > pages > > > > > > External Email > > > > > > ---------------------------------------------------------------------- > > > When populating a mempool, ensure that objects are not located across > > > several pages, except if user did not request iova contiguous objects. > > > > > > Signed-off-by: Vamsi Krishna Attunuru <vattun...@marvell.com> > > > Signed-off-by: Olivier Matz <olivier.m...@6wind.com> > > > --- > > > lib/librte_mempool/rte_mempool.c | 23 +++++----------- > > > lib/librte_mempool/rte_mempool_ops_default.c | 29 ++++++++++++++++++- > > > - > > > 2 files changed, 33 insertions(+), 19 deletions(-) > > > > > > diff --git a/lib/librte_mempool/rte_mempool.c > > > b/lib/librte_mempool/rte_mempool.c > > > index 7664764e5..b23fd1b06 100644 > > > --- a/lib/librte_mempool/rte_mempool.c > > > +++ b/lib/librte_mempool/rte_mempool.c > > > @@ -428,8 +428,6 @@ rte_mempool_get_page_size(struct rte_mempool > > > *mp, size_t *pg_sz) > > > > > > if (!need_iova_contig_obj) > > > *pg_sz = 0; > > > - else if (!alloc_in_ext_mem && rte_eal_iova_mode() == > > > RTE_IOVA_VA) > > > - *pg_sz = 0; > > > else if (rte_eal_has_hugepages() || alloc_in_ext_mem) > > > *pg_sz = get_min_page_size(mp->socket_id); > > > else > > > @@ -478,17 +476,15 @@ rte_mempool_populate_default(struct > > > rte_mempool *mp) > > > * then just set page shift and page size to 0, because the > > > user has > > > * indicated that there's no need to care about anything. > > > * > > > - * if we do need contiguous objects, there is also an option to reserve > > > - * the entire mempool memory as one contiguous block of memory, > > > in > > > - * which case the page shift and alignment wouldn't matter as well. > > > + * if we do need contiguous objects (if a mempool driver has its > > > + * own calc_size() method returning min_chunk_size = mem_size), > > > + * there is also an option to reserve the entire mempool memory > > > + * as one contiguous block of memory. > > > * > > > * if we require contiguous objects, but not necessarily the > > > entire > > > - * mempool reserved space to be contiguous, then there are two > > > options. > > > - * > > > - * if our IO addresses are virtual, not actual physical (IOVA as VA > > > - * case), then no page shift needed - our memory allocation will give > > > us > > > - * contiguous IO memory as far as the hardware is concerned, so > > > - * act as if we're getting contiguous memory. > > > + * mempool reserved space to be contiguous, pg_sz will be != 0, > > > + * and the default ops->populate() will take care of not placing > > > + * objects across pages. > > > * > > > * if our IO addresses are physical, we may get memory from > > > bigger > > > * pages, or we might get memory from smaller pages, and how > > > much of it @@ -501,11 +497,6 @@ rte_mempool_populate_default(struct > > > rte_mempool *mp) > > > * > > > * If we fail to get enough contiguous memory, then we'll go and > > > * reserve space in smaller chunks. > > > - * > > > - * We also have to take into account the fact that memory that we're > > > - * going to allocate from can belong to an externally allocated > > > memory > > > - * area, in which case the assumption of IOVA as VA mode being > > > - * synonymous with IOVA contiguousness will not hold. > > > */ > > > > > > need_iova_contig_obj = !(mp->flags & > > > MEMPOOL_F_NO_IOVA_CONTIG); diff --git > > > a/lib/librte_mempool/rte_mempool_ops_default.c > > > b/lib/librte_mempool/rte_mempool_ops_default.c > > > index f6aea7662..dd09a0a32 100644 > > > --- a/lib/librte_mempool/rte_mempool_ops_default.c > > > +++ b/lib/librte_mempool/rte_mempool_ops_default.c > > > @@ -61,21 +61,44 @@ rte_mempool_op_calc_mem_size_default(const > > > struct rte_mempool *mp, > > > return mem_size; > > > } > > > > > > +/* Returns -1 if object crosses a page boundary, else returns 0 */ > > > +static int check_obj_bounds(char *obj, size_t pg_sz, size_t elt_sz) { > > > + if (pg_sz == 0) > > > + return 0; > > > + if (elt_sz > pg_sz) > > > + return 0; > > > + if (RTE_PTR_ALIGN(obj, pg_sz) != RTE_PTR_ALIGN(obj + elt_sz - 1, > > > pg_sz)) > > > + return -1; > > > + return 0; > > > +} > > > + > > > int > > > rte_mempool_op_populate_default(struct rte_mempool *mp, unsigned int > > > max_objs, > > > void *vaddr, rte_iova_t iova, size_t len, > > > rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg) > > > { > > > - size_t total_elt_sz; > > > + char *va = vaddr; > > > + size_t total_elt_sz, pg_sz; > > > size_t off; > > > unsigned int i; > > > void *obj; > > > > > > + rte_mempool_get_page_size(mp, &pg_sz); > > > + > > > total_elt_sz = mp->header_size + mp->elt_size + > > > mp->trailer_size; > > > > > > - for (off = 0, i = 0; off + total_elt_sz <= len && i < max_objs; i++) { > > > + for (off = 0, i = 0; i < max_objs; i++) { > > > + /* align offset to next page start if required */ > > > + if (check_obj_bounds(va + off, pg_sz, total_elt_sz) < 0) > > > + off += RTE_PTR_ALIGN_CEIL(va + off, pg_sz) - (va + > > > off); > > Moving offset to the start of next page and than freeing (vaddr + off + > > header_size) to pool, this scheme is not aligning with octeontx2 mempool's > > buf alignment requirement(buffer address needs to be multiple of buffer > > size). > > It sounds line octeontx2 mempool should have its own populate callback > which cares about it. >
Agree, even the calc_mem_size() should be modified to be rigorous. Let me draft something in next version, so you can test it.