On 15/05/2024 23:34, Morten Brørup wrote:
>> From: Paul Szczepanek [mailto:paul.szczepa...@arm.com]
>>
>> AFAIK DPDK rte_mempool does require the addresses to be virtually
>> contiguous as the memory reservation is done during creation of the
>> mempool and a single memzone is reserved.
> 
> No, it does not.
> rte_pktmbuf_pool_create() first creates an empty mempool using 
> rte_mempool_create_empty(), and then populates it using 
> rte_mempool_populate_default(), which may use multiple memzones.
> 
> As one possible solution to this, the application can call 
> rte_pktmbuf_pool_create() as usual, and then check that mp->nb_mem_chunks == 
> 1 to confirm that all objects in the created mempool reside in one contiguous 
> chunk of memory and thus compression can be used.
> 
> Or even better, add a new mempool flag to the mempool library, e.g. 
> RTE_MEMPOOL_F_COMPRESSIBLE, specifying that the mempool objects must be 
> allocated as one chunk of memory with contiguous addresses.
> Unfortunately, rte_pktmbuf_pool_create() is missing the memzone flags 
> parameter, so a new variant of rte_pktmbuf_pool_create() with a flags 
> parameter would also need to be added.
> 

You're right, my misunderstanding stemmed from only one memzone being stored
in the rte_mempool struct; nb_mem_chunks is in fact the field to check to
verify that the mempool's VAs are contiguous. I'll look into adding a
contiguity option to the mempool library.
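
Something like the sketch below is what I have in mind for that check (a rough
sketch only, assuming the usual pktmbuf defaults; the helper name and the
256-entry cache are placeholders, not part of any DPDK API):

#include <errno.h>
#include <rte_errno.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

/* Create a pktmbuf pool the usual way and verify that all objects landed
 * in a single contiguous chunk of memory, per the nb_mem_chunks check
 * suggested above. */
static struct rte_mempool *
create_compressible_pktmbuf_pool(const char *name, unsigned int nb_mbufs,
		int socket_id)
{
	struct rte_mempool *mp;

	mp = rte_pktmbuf_pool_create(name, nb_mbufs, 256 /* cache */, 0,
			RTE_MBUF_DEFAULT_BUF_SIZE, socket_id);
	if (mp == NULL)
		return NULL;

	/* More than one memory chunk means the VAs are not guaranteed
	 * to be contiguous, so pointer compression cannot be used. */
	if (mp->nb_mem_chunks != 1) {
		rte_mempool_free(mp);
		rte_errno = ENOTSUP;
		return NULL;
	}

	return mp;
}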

> 
> For future proofing, please rename the compression functions to include the 
> compression algorithm, i.e. "shift" or similar, in the function names.
> 
> Specifically I'm thinking about an alternative "multiply" compression 
> algorithm based on division/multiplication by a constant "multiplier" 
> parameter (instead of the "bit_shift" parameter).
> This "multiplier" would typically be the object size of the packet mbuf 
> mempool.
> The "multiplier" could be constant at built time, e.g. 2368, or determined at 
> runtime.
> I don't know the performance of division/multiplication compared to bit 
> shifting for various CPUs, but it would make compression to 16 bit compressed 
> pointers viable for more applications.
> 
> The perf test in this series could be used to determine 
> compression/decompression performance of such an algorithm, and the 
> application developer can determine which algorithm to use; "shift" with 32 
> bit compressed pointers, or "multiply" with 16 bit compressed pointers.
> 

Will add "shift" to the function names.
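
For comparison, a minimal sketch of the two schemes side by side (the function
names are illustrative only, not the API proposed in this series; "base" is the
start address of the mempool's single contiguous chunk, and "obj_size" plays
the role of your constant "multiplier", e.g. 2368):

#include <stdint.h>

/* "shift" scheme: store the byte offset divided by a power of two. */
static inline uint32_t
ptr_compress_shift_32(void *base, void *ptr, unsigned int bit_shift)
{
	uintptr_t offset = (uintptr_t)ptr - (uintptr_t)base;

	return (uint32_t)(offset >> bit_shift);
}

static inline void *
ptr_decompress_shift_32(void *base, uint32_t c, unsigned int bit_shift)
{
	return (void *)((uintptr_t)base + ((uintptr_t)c << bit_shift));
}

/* "multiply" scheme: store the object's index in the mempool, i.e. the
 * byte offset divided by the object size. With 16 bits this covers up
 * to 65536 objects regardless of the object size. */
static inline uint16_t
ptr_compress_mul_16(void *base, void *ptr, uint32_t obj_size)
{
	uintptr_t offset = (uintptr_t)ptr - (uintptr_t)base;

	return (uint16_t)(offset / obj_size);
}

static inline void *
ptr_decompress_mul_16(void *base, uint16_t idx, uint32_t obj_size)
{
	return (void *)((uintptr_t)base + (uintptr_t)idx * obj_size);
}

The perf test in the series should then show whether the division in the
"multiply" variant is acceptable on the target CPUs.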
