On 15/05/2024 23:34, Morten Brørup wrote:
>> From: Paul Szczepanek [mailto:paul.szczepa...@arm.com]
>>
>> AFAIK DPDK rte_mempool does require the addresses to be virtually
>> contiguous as the memory reservation is done during creation of the
>> mempool and a single memzone is reserved.
>
> No, it does not.
> rte_pktmbuf_pool_create() first creates an empty mempool using
> rte_mempool_create_empty(), and then populates it using
> rte_mempool_populate_default(), which may use multiple memzones.
>
> As one possible solution to this, the application can call
> rte_pktmbuf_pool_create() as usual, and then check that
> mp->nb_mem_chunks == 1 to confirm that all objects in the created
> mempool reside in one contiguous chunk of memory and thus compression
> can be used.
>
> Or even better, add a new mempool flag to the mempool library, e.g.
> RTE_MEMPOOL_F_COMPRESSIBLE, specifying that the mempool objects must
> be allocated as one chunk of memory with contiguous addresses.
> Unfortunately, rte_pktmbuf_pool_create() is missing the memzone flags
> parameter, so a new rte_pktmbuf_pool_create() API with the flags
> parameter added would also need to be added.
>
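
If I follow the suggestion, the application-side check would look
roughly like the sketch below (the pool name and sizing are arbitrary
example values, not taken from this series):

#include <stdio.h>

#include <rte_mbuf.h>
#include <rte_mempool.h>
#include <rte_lcore.h>

/* Create a packet mbuf pool and warn if its objects do not all live in
 * one virtually contiguous chunk, i.e. if mp->nb_mem_chunks != 1. */
static struct rte_mempool *
create_compressible_pool(void)
{
	struct rte_mempool *mp;

	mp = rte_pktmbuf_pool_create("mbuf_pool", 4096, 256, 0,
			RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
	if (mp == NULL)
		return NULL;

	/* Population may have used several memzones; only a single
	 * memory chunk guarantees one contiguous VA range. */
	if (mp->nb_mem_chunks != 1)
		printf("mbuf pool spans %u chunks, pointer compression "
				"cannot be used\n", mp->nb_mem_chunks);

	return mp;
}
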
You're right, my misunderstanding stemmed from only one memzone (mz)
being stored in the rte_mempool struct; nb_mem_chunks is in fact the
field to check to verify that the mempool's VAs are contiguous.
I'll look into the possibility of adding a contiguous-allocation option
to the mempool library.

> For future proofing, please rename the compression functions to
> include the compression algorithm, i.e. "shift" or similar, in the
> function names.
>
> Specifically I'm thinking about an alternative "multiply" compression
> algorithm based on division/multiplication by a constant "multiplier"
> parameter (instead of the "bit_shift" parameter).
> This "multiplier" would typically be the object size of the packet
> mbuf mempool.
> The "multiplier" could be constant at build time, e.g. 2368, or
> determined at runtime.
> I don't know the performance of division/multiplication compared to
> bit shifting for various CPUs, but it would make compression to 16 bit
> compressed pointers viable for more applications.
>
> The perf test in this series could be used to determine
> compression/decompression performance of such an algorithm, and the
> application developer can determine which algorithm to use; "shift"
> with 32 bit compressed pointers, or "multiply" with 16 bit compressed
> pointers.

Will add "shift" to the function names.
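
For reference, my rough reading of the "multiply" scheme is the sketch
below (standalone example, not the API from this series; 2368 is only
the example object size from your mail, and a real implementation would
more likely take the multiplier at runtime from the mempool element
size):

#include <stdint.h>

/* Example object size used as the divisor/multiplier. */
#define OBJ_SIZE 2368

/* Compress: the offset from the pool base divided by the object size
 * gives an object index, which fits in 16 bits for pools of up to
 * 64K objects. */
static inline uint16_t
ptr_compress_mult_16(void *base, void *ptr)
{
	return (uint16_t)(((uintptr_t)ptr - (uintptr_t)base) / OBJ_SIZE);
}

/* Decompress: multiply the index back into a byte offset and add it to
 * the pool base. */
static inline void *
ptr_decompress_mult_16(void *base, uint16_t idx)
{
	return (void *)((uintptr_t)base + (uintptr_t)idx * OBJ_SIZE);
}

With a build-time constant the compiler can presumably strength-reduce
the division to a multiply/shift sequence, so the runtime-multiplier
case is probably the more interesting one for the perf test.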