Hi David,

Please find some comments below.
On 05/19/2016 04:48 PM, David Hunt wrote:
> This is a mempool handler that is useful for pipelining apps, where
> the mempool cache doesn't really work - example, where we have one
> core doing rx (and alloc), and another core doing Tx (and return). In
> such a case, the mempool ring simply cycles through all the mbufs,
> resulting in a LLC miss on every mbuf allocated when the number of
> mbufs is large. A stack recycles buffers more effectively in this
> case.
>
> v2: cleanup based on mailing list comments. Mainly removal of
> unnecessary casts and comments.
>
> Signed-off-by: David Hunt <david.hunt at intel.com>
> ---
>  lib/librte_mempool/Makefile            |   1 +
>  lib/librte_mempool/rte_mempool_stack.c | 145 +++++++++++++++++++++++++++++++++
>  2 files changed, 146 insertions(+)
>  create mode 100644 lib/librte_mempool/rte_mempool_stack.c
>
> diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
> index f19366e..5aa9ef8 100644
> --- a/lib/librte_mempool/Makefile
> +++ b/lib/librte_mempool/Makefile
> @@ -44,6 +44,7 @@ LIBABIVER := 2
>  SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) += rte_mempool.c
>  SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) += rte_mempool_handler.c
>  SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) += rte_mempool_default.c
> +SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) += rte_mempool_stack.c
>  # install includes
>  SYMLINK-$(CONFIG_RTE_LIBRTE_MEMPOOL)-include := rte_mempool.h
>
> diff --git a/lib/librte_mempool/rte_mempool_stack.c b/lib/librte_mempool/rte_mempool_stack.c
> new file mode 100644
> index 0000000..6e25028
> --- /dev/null
> +++ b/lib/librte_mempool/rte_mempool_stack.c
> @@ -0,0 +1,145 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + *   All rights reserved.

Should be 2016?

> ...
> +
> +static void *
> +common_stack_alloc(struct rte_mempool *mp)
> +{
> +	struct rte_mempool_common_stack *s;
> +	unsigned n = mp->size;
> +	int size = sizeof(*s) + (n+16)*sizeof(void *);
> +
> +	/* Allocate our local memory structure */
> +	s = rte_zmalloc_socket("common-stack",

"mempool-stack"?

> +		size,
> +		RTE_CACHE_LINE_SIZE,
> +		mp->socket_id);
> +	if (s == NULL) {
> +		RTE_LOG(ERR, MEMPOOL, "Cannot allocate stack!\n");
> +		return NULL;
> +	}
> +
> +	rte_spinlock_init(&s->sl);
> +
> +	s->size = n;
> +	mp->pool = s;
> +	rte_mempool_set_handler(mp, "stack");

rte_mempool_set_handler() is a user function; it shouldn't be called here (see the usage sketch after these comments).

> +
> +	return s;
> +}
> +
> +static int common_stack_put(void *p, void * const *obj_table,
> +		unsigned n)
> +{
> +	struct rte_mempool_common_stack *s = p;
> +	void **cache_objs;
> +	unsigned index;
> +
> +	rte_spinlock_lock(&s->sl);
> +	cache_objs = &s->objs[s->len];
> +
> +	/* Is there sufficient space in the stack ? */
> +	if ((s->len + n) > s->size) {
> +		rte_spinlock_unlock(&s->sl);
> +		return -ENOENT;
> +	}

The usual return value for a failing put() is -ENOBUFS (see rte_ring); a corrected sketch follows below.

After reading it, I realize that it's nearly exactly the same code as in "app/test: test external mempool handler":
http://patchwork.dpdk.org/dev/patchwork/patch/12896/

We should drop one of them. If this stack handler is really useful for a performance use case, it could go in librte_mempool.

On first read, the code looks like a demo example: it uses a simple spinlock for concurrent accesses to the common pool. Maybe the mempool cache hides this cost; in that case, we could also consider removing the use of the rte_ring.

Do you have any performance numbers? Do you know if it scales with the number of cores?
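To illustrate the rte_mempool_set_handler() point: handler selection belongs to the application, before the pool is populated, so the alloc callback should not re-select itself. A minimal sketch of the flow I would expect on the application side — the pool name, sizes, and the create_empty/populate calls are illustrative assumptions on my part, untested:

	#include <rte_mempool.h>

	static struct rte_mempool *
	create_stack_pool(void)
	{
		struct rte_mempool *mp;

		/* create an empty pool; the application picks the handler */
		mp = rte_mempool_create_empty("stack_pool", 8192, 2048,
					      0, 0, SOCKET_ID_ANY, 0);
		if (mp == NULL)
			return NULL;

		/* handler is chosen here, not inside common_stack_alloc() */
		rte_mempool_set_handler(mp, "stack");

		if (rte_mempool_populate_default(mp) < 0) {
			rte_mempool_free(mp);
			return NULL;
		}
		return mp;
	}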
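On the -ENOENT vs -ENOBUFS point, the failure path could simply mirror rte_ring. A sketch of the full function, derived from the code quoted above (untested; the push loop is my guess at the elided part):

	static int
	common_stack_put(void *p, void * const *obj_table, unsigned n)
	{
		struct rte_mempool_common_stack *s = p;
		void **cache_objs;
		unsigned index;

		rte_spinlock_lock(&s->sl);

		/* is there sufficient space in the stack? */
		if ((s->len + n) > s->size) {
			rte_spinlock_unlock(&s->sl);
			return -ENOBUFS; /* same convention as rte_ring when full */
		}

		/* push the objects onto the stack */
		cache_objs = &s->objs[s->len];
		for (index = 0; index < n; index++)
			cache_objs[index] = obj_table[index];
		s->len += n;

		rte_spinlock_unlock(&s->sl);
		return 0;
	}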
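About the performance numbers: a rough way to measure would be a per-lcore get/put loop, run for both the "stack" handler and the default ring handler while increasing the number of cores. Everything below (function name, constants) is made up for illustration and untested:

	#include <stdio.h>
	#include <stdint.h>
	#include <rte_cycles.h>
	#include <rte_lcore.h>
	#include <rte_mempool.h>

	#define BENCH_ITERATIONS 1000000
	#define BENCH_BULK_SIZE  32

	/* rough per-lcore benchmark: cycles per object for a get/put pair */
	static int
	mempool_handler_bench(void *arg)
	{
		struct rte_mempool *mp = arg;
		void *objs[BENCH_BULK_SIZE];
		uint64_t start, end;
		unsigned i;

		start = rte_rdtsc();
		for (i = 0; i < BENCH_ITERATIONS; i++) {
			if (rte_mempool_get_bulk(mp, objs, BENCH_BULK_SIZE) < 0)
				continue; /* pool temporarily empty, retry */
			rte_mempool_put_bulk(mp, objs, BENCH_BULK_SIZE);
		}
		end = rte_rdtsc();

		printf("lcore %u: %.2f cycles/object\n", rte_lcore_id(),
		       (double)(end - start) /
		       ((double)BENCH_ITERATIONS * BENCH_BULK_SIZE));
		return 0;
	}

Launching this with rte_eal_remote_launch() on 1, 2, 4, ... lcores for each handler would show whether the spinlock becomes the bottleneck, and where the stack starts to win over the ring.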
If we can identify the conditions where this mempool handler outperforms the default handler, it would be valuable to have them in the documentation.

Regards,
Olivier