Hi,

I'm not sure if this is a bug, but I've seen an inconsistency in DPDK's behavior with regard to hugepage allocation for rte_mempool. Basically, for the same mempool size, the number of hugepages allocated changes from run to run.

Here's how I reproduce it with DPDK 19.11, IOVA=pa (default):

1. Reserve 16x1G hugepages on socket 0.
2. Replace examples/skeleton/basicfwd.c with the code below, then build and run:
     make && ./build/basicfwd
3. At the same time, watch the number of hugepages allocated:
     watch -n.1 ls /dev/hugepages
4. Repeat step 2.

If you can reproduce this, you should see that on some runs DPDK allocates 5 hugepages, and on others it allocates 6. When it allocates 6, the output from step 3 shows that DPDK first tries to allocate 5 hugepages, then unmaps all 5, retries, and gets 6.

For our use case, it's important that DPDK allocates the same number of hugepages on every run so we can get reproducible results.

Studying the code, this seems to be the behavior of rte_mempool_populate_default(). If I understand correctly, if the first try fails to get 5 IOVA-contiguous pages, it retries with the IOVA-contiguous condition relaxed, and eventually winds up with 6 hugepages.

Questions:

1. Why does the API sometimes fail to get IOVA-contiguous memory when hugepage memory is abundant?
2. Why does the second try need N+1 hugepages?

Some insights for Q1: from my experiments, it seems the IOVA of the first hugepage is not guaranteed to be at the start of the IOVA space (understandably). That could explain the retry when the IOVA of the first hugepage is near the end of the IOVA space. But I have also seen situations where the first hugepage is near the beginning of the IOVA space and the first try still failed.

Here's the code:

    #include <stdio.h>      /* printf */
    #include <stdlib.h>     /* EXIT_FAILURE */

    #include <rte_eal.h>
    #include <rte_mbuf.h>

    int
    main(int argc, char *argv[])
    {
            struct rte_mempool *mbuf_pool;
            unsigned mbuf_pool_size = 2097151;

            /* Initialize the EAL; this maps the reserved hugepages. */
            int ret = rte_eal_init(argc, argv);
            if (ret < 0)
                    rte_exit(EXIT_FAILURE, "Error with EAL initialization\n");

            /* Create the pool; hugepage usage varies between runs here. */
            printf("Creating mbuf pool size=%u\n", mbuf_pool_size);
            mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", mbuf_pool_size,
                    256, 0, RTE_MBUF_DEFAULT_BUF_SIZE, 0);

            printf("mbuf_pool %p\n", mbuf_pool);

            return 0;
    }

Best regards,
BL
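
P.S. For anyone trying to reproduce this, the check in step 3 can also be scripted instead of eyeballing the watch output. Below is a minimal sketch in plain C (no DPDK calls). The function name count_hugepage_files is hypothetical, and the sketch assumes that hugetlbfs is mounted at /dev/hugepages and that each backing file there corresponds to one mapped hugepage. The "rtemap_" prefix mentioned below is also an assumption about DPDK's default file naming, so adjust it if your setup uses a different --file-prefix.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: count the entries in `dir` whose names start
 * with `prefix`. Returns -1 if the directory cannot be opened.
 * Assumption: one hugetlbfs file per mapped hugepage.
 */
int count_hugepage_files(const char *dir, const char *prefix)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    size_t plen = strlen(prefix);
    int n = 0;
    struct dirent *e;

    /* Walk the directory and count names that match the prefix. */
    while ((e = readdir(d)) != NULL) {
        if (strncmp(e->d_name, prefix, plen) == 0)
            n++;
    }

    closedir(d);
    return n;
}
```

Calling count_hugepage_files("/dev/hugepages", "rtemap_") in a polling loop while basicfwd starts, and logging the value each time it changes, should make the 5, then 0, then 6 sequence visible in a log rather than only on screen.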