> On Jul 13, 2018, at 12:10 PM, Burakov, Anatoly <anatoly.bura...@intel.com> 
> wrote:
> 
> On 06-Jul-18 2:17 PM, Anatoly Burakov wrote:
>> This is a proposal to enable using externally allocated memory
>> in DPDK.
>> In a nutshell, here is what is being done here:
>> - Index malloc heaps by NUMA node index, rather than NUMA node itself
>> - Add identifier string to malloc heap, to uniquely identify it
>> - Allow creating named heaps and add/remove memory to/from those heaps
>> - Allocate memseg lists at runtime, to keep track of IOVA addresses
>>   of externally allocated memory
>>   - If IOVA addresses aren't provided, use RTE_BAD_IOVA
>> - Allow malloc and memzones to allocate from named heaps
>> The responsibility to ensure memory is accessible before using it is
>> on the shoulders of the user - there is no checking done with regards
>> to validity of the memory (nor could there be...).
>> The following limitations are present:
>> - No multiprocess support
>> - No thread safety
>> There is currently no way to allocate memory during initialization
>> stage, so even if multiprocess support is added, it is not guaranteed
>> to work because of underlying issues with mapping fbarrays in
>> secondary processes. This is not an issue in single process scenario,
>> but it may be an issue in a multiprocess scenario in case where
>> primary doesn't intend to share the externally allocated memory, yet
>> adding such memory could fail because some other process failed to
>> attach to this shared memory when it wasn't needed.
>> Anatoly Burakov (11):
>>   mem: allow memseg lists to be marked as external
>>   eal: add function to rerieve socket index by socket ID
>>   malloc: index heaps using heap ID rather than NUMA node
>>   malloc: add name to malloc heaps
>>   malloc: enable retrieving statistics from named heaps
>>   malloc: enable allocating from named heaps
>>   malloc: enable creating new malloc heaps
>>   malloc: allow adding memory to named heaps
>>   malloc: allow removing memory from named heaps
>>   malloc: allow destroying heaps
>>   memzone: enable reserving memory from named heaps
>>  config/common_base                            |   1 +
>>  lib/librte_eal/common/eal_common_lcore.c      |  15 +
>>  lib/librte_eal/common/eal_common_memory.c     |  51 +++-
>>  lib/librte_eal/common/eal_common_memzone.c    | 283 ++++++++++++++----
>>  .../common/include/rte_eal_memconfig.h        |   5 +-
>>  lib/librte_eal/common/include/rte_lcore.h     |  19 +-
>>  lib/librte_eal/common/include/rte_malloc.h    | 158 +++++++++-
>>  .../common/include/rte_malloc_heap.h          |   2 +
>>  lib/librte_eal/common/include/rte_memzone.h   | 183 +++++++++++
>>  lib/librte_eal/common/malloc_heap.c           | 277 +++++++++++++++--
>>  lib/librte_eal/common/malloc_heap.h           |  26 ++
>>  lib/librte_eal/common/rte_malloc.c            | 197 +++++++++++-
>>  lib/librte_eal/rte_eal_version.map            |  10 +
>>  13 files changed, 1118 insertions(+), 109 deletions(-)
> 
> So, now that the RFC is out, i would like to ask a general question.
> 
> One other thing that this patchset is missing, is the ability for data 
> structures (e.g. hash, mempool, etc.) to be allocated from external heaps. 
> Currently, we can kinda sorta do that with various _init() API's 
> (initializing a data structure over already allocated memzone), but this is 
> not ideal and is a hassle for anyone using external memory in DPDK.
> 
> There are basically four ways to approach this problem (that i can see).
> 
> First way is to change "socket ID" to mean "heap ID" everywhere. This has an 
> upside of having a consistent API to allocate from internal and external 
> heaps, with little to no API additions, only internal changes to account for 
> the fact that "socket ID" is now "heap ID".
> 
> However, there is a massive downside to this approach: it is a *giant* API 
> change, and it's also a giant *ABI-compatible* API change. Meaning, replacing 
> socket ID with heap ID will not cause compile failures for old code, which 
> would result in many subtle bugs in already existing codebases. So, while in 
> the perfect world this would've been my preferred approach, realistically i 
> think this is a very, very bad idea.
> 
> Second one is to add a separate "heap name" API's to everything. This has an 
> upside of clean separation between allocation from internal and external 
> heaps. (well, whether it's an upside is debatable...) This is the approach i 
> expected to take when i was creating this patchset.
> 
> The downside is that we have to add new API's to every library and every DPDK 
> data structure, to allow explicit allocation from external heaps. We will 
> have to maintain both, and things like hardware drivers will need to have a 
> way to indicate the need to allocate things from a particular external heap.
> 
> The third way is to expose the "heap ID" externally, and allow a single, 
> unified API to reserve memory. That is, create an API that would map either a 
> NUMA node ID or a heap name to an ID, and allow reserving memory through that 
> ID regardless of whether it's internal or external memory. This would also 
> allow to gradually phase out socket-based ID's in favor of heap ID API, 
> should we choose to do so.
> 
> The downside for this is, it adds a layer of indirection between socket ID 
> and reserving memory on a particular NUMA node, and it makes it hard to 
> produce a single value of "heap ID" in such a way as to replicate current 
> functionality of allocating with SOCKET_ID_ANY. Most likely user will have to 
> explicitly try to allocate on all sockets, unless we keep old API's around in 
> parallel.
> 
> Finally, a fourth way would be to abuse the socket ID to also mean something 
> else, which is an approach i've seen numerous times already, and one that i 
> don't like. We could register new heaps as a new, fake socket ID, and use 
> that to address external heaps (each heap would get its own socket). So, keep 
> current socket ID behavior, but for non-existent sockets it would be possible 
> to be registered as a fake socket pointing to an external heap.
> 
> The upside for this approach would be that no API changes are required 
> whatsoever to existing libraries - this scheme is compatible with both 
> internal and external heaps without adding a separate API.
> 
> The downside is bad semantics - "special" sockets, handling of SOCKET_ID_ANY, 
> handling of "invalid socket" vs. "invalid socket that happens to correspond 
> to an existing external heap", and many other things that can be confusing. I 
> don't like this option, but it's an option :)
> 
> Thoughts? Comments?

#1 is super clean, but very disruptive to everyone. Very Bad IMO
#2 is also clean, but adds a lot of new APIs that everyone needs to use or at 
least in the external heap cases.
#3 not sure I fully understand it, but reproducing heap IDs for testing is a 
problem and requires new/old APIs

#4 Very easy to add, IMO it is clean and very small disruption to developers. 
It does require the special handling, but I feel it is OK and can be explained 
in the docs. Having a socket id as an ‘int’ gives us a lot room e.g. id < 64K 
is normal socket and > 64K is external id.

My vote would be #4, as it seems the least risk and work. :-)

> 
> I myself still favor the second way, however there are good arguments to be 
> made for each of these options.
> 
> -- 
> Thanks,
> Anatoly

Regards,
Keith

Reply via email to