On 14-Feb-18 10:07 AM, Burakov, Anatoly wrote:
On 14-Feb-18 8:04 AM, Thomas Monjalon wrote:
Hi Anatoly,
19/12/2017 12:14, Anatoly Burakov:
* Memory tagging. This is related to the previous item. Right now, we can only ask malloc to allocate memory by page size, but one could potentially have different memory regions backed by pages of similar sizes (for example, locked 1G pages, to completely avoid TLB misses, alongside regular 1G pages), and it would be good to have a mechanism to distinguish between the different memory types available to a DPDK application. One could, for example, tag memory by "purpose" (e.g. "fast", "slow"), or in other ways.
How do you imagine memory tagging?
Should it be a parameter when requesting some memory from rte_malloc
or rte_mempool?
We can't make it a parameter for mempool without making it a parameter for rte_malloc, as every memory allocation in DPDK works through rte_malloc. So at the very least, rte_malloc will have it. And as long as rte_malloc has it, there's no reason why memzones and mempools couldn't as well - it's not much code to add.
Could it be a bit-field allowing to combine some properties?
Does it make sense to have "DMA" as one of the purpose?
Something like a bitfield would be my preference, yes. That way we could classify memory in certain ways and allocate based on that. Which "certain ways" these are, I'm not sure. For example, in addition to tagging memory as "DMA-capable" (which I think is a given), one might tag certain memory as "non-default" - as in, never allocate from this chunk of memory unless explicitly asked to do so. This could be useful for types of memory that are a precious resource.
Then again, it is likely that we won't have many types of memory in DPDK, and any other type would be implementation-specific, so maybe just stringly-typing it is OK (maybe we can finally make use of the "type" parameter in rte_malloc!).
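To make the bitfield idea concrete, here is a minimal sketch of how property flags and a matching rule could look. All names (MEM_PROP_*, heap_desc, heap_matches) are made up for illustration - this is not existing or proposed DPDK API:

```c
#include <stdint.h>

/* Hypothetical property bits -- illustrative names, not an existing DPDK API. */
#define MEM_PROP_DMA_CAPABLE (1u << 0) /* safe to hand to a device for DMA */
#define MEM_PROP_NON_DEFAULT (1u << 1) /* precious: only use when explicitly asked */
#define MEM_PROP_LOCKED      (1u << 2) /* e.g. locked 1G pages */

struct heap_desc {
	const char *name;
	uint32_t props; /* bit-field of MEM_PROP_* flags */
};

/* A heap satisfies a request if it provides every requested property, and
 * a "non-default" heap is skipped unless the caller asked for it. */
static int heap_matches(const struct heap_desc *h, uint32_t req)
{
	if ((h->props & req) != req)
		return 0;
	if ((h->props & MEM_PROP_NON_DEFAULT) && !(req & MEM_PROP_NON_DEFAULT))
		return 0;
	return 1;
}
```

The "non-default" check is the interesting part: a precious heap combines with any property mask, but never matches an ordinary allocation request.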
How to transparently allocate the best memory for the NIC?
You take care of the NUMA socket property, but there can be more
requirements, like getting memory from the NIC itself.
I would think that we can't make it generic enough to cover all cases, so it's best to expose some APIs and let PMDs handle this themselves.
+Cc more people (6WIND, Cavium, Chelsio, Mellanox, Netronome, NXP,
Solarflare)
in order to trigger a discussion about the ideal requirements.
Hi all,
I would like to restart this discussion, again :) I would like to hear some feedback on my thoughts below.
I've thought about it some more, and while I have lots of use cases in mind, I suspect covering them all while keeping a sane API is unrealistic.
So, first things first.
The main issue we have is the 1:1 correspondence between malloc heaps and socket IDs. This has led to various attempts to hijack socket IDs to do something else - I've seen this approach a few times before, most recently in a patch by Srinath/Broadcom [1]. We need to break this dependency somehow, and have a unique heap identifier.
Also, since memory allocators are expected to behave roughly like drivers (e.g. have a driver API and provide hooks for init/alloc/free functions, etc.), a request to allocate memory may not just go to the heap itself (which is handled internally by rte_malloc), but also to its respective allocator. This is roughly similar to what happens currently, except that which allocator functions get called will then depend on which driver allocated that heap.
So, we arrive at a dependency - heap => allocator. Each heap must know which allocator it belongs to - so, we also need a way to identify not just the heap, but the allocator as well.
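The heap => allocator dependency could look something like the sketch below. The struct layout and the driver-ops names are hypothetical, purely to show the shape of the idea - each heap carries a unique name instead of a socket ID, plus a pointer to the ops of the driver that created it:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical driver-style allocator hooks -- illustrative, not real DPDK. */
struct malloc_driver_ops {
	void *(*alloc)(size_t size);
	void  (*free_fn)(void *ptr);
};

/* Each heap records a unique identifier and which allocator owns it,
 * breaking the 1:1 heap <-> socket-ID tie discussed above. */
struct malloc_heap {
	char name[32];
	const struct malloc_driver_ops *ops;
};

/* rte_malloc-style entry point: heap bookkeeping would happen here, then
 * the request is forwarded to whichever driver owns the heap. */
static void *heap_alloc(struct malloc_heap *h, size_t size)
{
	return h->ops->alloc(size);
}
```

With this shape, rte_malloc keeps doing the generic bookkeeping, and only the final alloc/free step is dispatched per driver.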
In the above quotes from previous mails I suggested categorizing memory by "types", but now that I think of it, the API would've been too complex, as we would've ideally had to cover use cases such as "allocate memory of this type, no matter which allocator it comes from", "allocate memory from this particular heap", "allocate memory from this particular allocator"... It gets complicated pretty fast.
What I propose instead is this. 99% of the time, the user wants our hugepage allocator, so by default, all allocations will come through that. In the event that the user needs memory from a specific heap, we provide a new set of APIs to request memory from that specific heap.
Do we expect situations where the user might *not* want the default allocator, but also *not* know which exact heap they want? If the answer is no (which I'm counting on :) ), then allocating from a specific malloc driver becomes as simple as something like this:
mem = rte_malloc_from_heap("my_very_special_heap");
(stringly-typed heap ID is just an example)
So, old APIs remain intact and are always passed through to the default allocator, while new APIs will grant access to other allocators.
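The split between the legacy path and the new heap-selection path could be sketched like this. Everything here is a stand-in (demo_malloc, demo_malloc_from_heap, the string heap IDs) - it only demonstrates the routing, with plain malloc standing in for the real allocation path:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of the proposed API split, assuming string heap IDs (hypothetical).
 * Legacy allocations always go to the default hugepage allocator; the new
 * call picks a heap by name and fails rather than silently falling back. */
struct heap { const char *id; };

static struct heap heap_table[] = {
	{ "default" },
	{ "my_very_special_heap" },
};

static struct heap *heap_find(const char *id)
{
	for (size_t i = 0; i < sizeof(heap_table) / sizeof(heap_table[0]); i++)
		if (strcmp(heap_table[i].id, id) == 0)
			return &heap_table[i];
	return NULL;
}

/* Old-style API: unchanged, always routed to the default heap. */
static void *demo_malloc(size_t size)
{
	(void)heap_find("default"); /* default heap is always present */
	return malloc(size);        /* stand-in for the real allocation path */
}

/* New-style API: explicit heap selection by ID. */
static void *demo_malloc_from_heap(const char *heap_id, size_t size)
{
	return heap_find(heap_id) != NULL ? malloc(size) : NULL;
}
```

Failing on an unknown heap ID, rather than falling back to the default, keeps the "user explicitly asked for special memory" contract honest.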
Heap ID alone, however, may not provide enough flexibility. For example, if a malloc driver allocates a specific kind of memory that is NUMA-aware, it would perhaps be awkward to use different heap IDs when the memory being allocated is arguably the same, just subdivided into several blocks. Moreover, handling situations like this would likely require some cooperation from the allocator itself (possibly some allocator-specific APIs), but if we add malloc heap arguments, those would have to be generic. I'm not sure if we want to go that far, though.
Does that sound reasonable?
Another tangentially related issue, raised by Olivier [2], is that of allocating memory in blocks, rather than using rte_malloc. The current implementation has rte_malloc store its metadata right in the allocated memory - this leads to unnecessary memory fragmentation in certain cases, such as allocating memory page by page, and in general pollutes memory we might not want to pollute with malloc metadata.
To fix this, the memory allocator would have to store malloc metadata externally, which comes with a few caveats (reverse mapping of pointers to malloc elements; storing, looking up and accounting for said elements; etc.). It's not currently planned work, but it's certainly something to think about :)
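The reverse pointer-to-element lookup is the main caveat, so here is one way it could work, sketched with a fixed-size open-addressing table. Everything here is hypothetical (ELEM_SLOTS, the hash, the struct); a real implementation would also need resizing, deletion handling, and locking:

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of externally stored malloc metadata. Keeping the element
 * descriptor outside the allocation avoids polluting user memory with
 * allocator headers, at the cost of a reverse pointer->element lookup. */
#define ELEM_SLOTS 1024

struct malloc_elem {
	void *addr;  /* start of the user allocation; NULL = empty slot */
	size_t size;
};

static struct malloc_elem elems[ELEM_SLOTS];

static size_t elem_slot(const void *p)
{
	return ((uintptr_t)p >> 4) % ELEM_SLOTS; /* crude hash of the address */
}

/* Record metadata for an allocation, using linear probing on collision. */
static int elem_store(void *p, size_t size)
{
	for (size_t i = 0; i < ELEM_SLOTS; i++) {
		struct malloc_elem *e = &elems[(elem_slot(p) + i) % ELEM_SLOTS];
		if (e->addr == NULL) {
			e->addr = p;
			e->size = size;
			return 0;
		}
	}
	return -1; /* table full */
}

/* Reverse-map a pointer back to its metadata, e.g. on free(). */
static struct malloc_elem *elem_lookup(const void *p)
{
	for (size_t i = 0; i < ELEM_SLOTS; i++) {
		struct malloc_elem *e = &elems[(elem_slot(p) + i) % ELEM_SLOTS];
		if (e->addr == p)
			return e;
		if (e->addr == NULL)
			return NULL; /* probe chain ends: pointer not tracked */
	}
	return NULL;
}
```

Every free() now costs a table probe instead of reading a header next to the pointer, which is the accounting overhead mentioned above.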
[1] http://dpdk.org/dev/patchwork/patch/36596/
[2] http://dpdk.org/ml/archives/dev/2018-March/093212.html
--
Thanks,
Anatoly