> On Jul 13, 2018, at 12:10 PM, Burakov, Anatoly <anatoly.bura...@intel.com> > wrote: > > On 06-Jul-18 2:17 PM, Anatoly Burakov wrote: >> This is a proposal to enable using externally allocated memory >> in DPDK. >> In a nutshell, here is what is being done here: >> - Index malloc heaps by NUMA node index, rather than NUMA node itself >> - Add identifier string to malloc heap, to uniquely identify it >> - Allow creating named heaps and add/remove memory to/from those heaps >> - Allocate memseg lists at runtime, to keep track of IOVA addresses >> of externally allocated memory >> - If IOVA addresses aren't provided, use RTE_BAD_IOVA >> - Allow malloc and memzones to allocate from named heaps >> The responsibility to ensure memory is accessible before using it is >> on the shoulders of the user - there is no checking done with regards >> to validity of the memory (nor could there be...). >> The following limitations are present: >> - No multiprocess support >> - No thread safety >> There is currently no way to allocate memory during initialization >> stage, so even if multiprocess support is added, it is not guaranteed >> to work because of underlying issues with mapping fbarrays in >> secondary processes. This is not an issue in single process scenario, >> but it may be an issue in a multiprocess scenario in case where >> primary doesn't intend to share the externally allocated memory, yet >> adding such memory could fail because some other process failed to >> attach to this shared memory when it wasn't needed. >> Anatoly Burakov (11): >> mem: allow memseg lists to be marked as external >> eal: add function to rerieve socket index by socket ID >> malloc: index heaps using heap ID rather than NUMA node >> malloc: add name to malloc heaps >> malloc: enable retrieving statistics from named heaps >> malloc: enable allocating from named heaps >> malloc: enable creating new malloc heaps >> malloc: allow adding memory to named heaps >> malloc: allow removing memory from named heaps >> malloc: allow destroying heaps >> memzone: enable reserving memory from named heaps >> config/common_base | 1 + >> lib/librte_eal/common/eal_common_lcore.c | 15 + >> lib/librte_eal/common/eal_common_memory.c | 51 +++- >> lib/librte_eal/common/eal_common_memzone.c | 283 ++++++++++++++---- >> .../common/include/rte_eal_memconfig.h | 5 +- >> lib/librte_eal/common/include/rte_lcore.h | 19 +- >> lib/librte_eal/common/include/rte_malloc.h | 158 +++++++++- >> .../common/include/rte_malloc_heap.h | 2 + >> lib/librte_eal/common/include/rte_memzone.h | 183 +++++++++++ >> lib/librte_eal/common/malloc_heap.c | 277 +++++++++++++++-- >> lib/librte_eal/common/malloc_heap.h | 26 ++ >> lib/librte_eal/common/rte_malloc.c | 197 +++++++++++- >> lib/librte_eal/rte_eal_version.map | 10 + >> 13 files changed, 1118 insertions(+), 109 deletions(-) > > So, now that the RFC is out, i would like to ask a general question. > > One other thing that this patchset is missing, is the ability for data > structures (e.g. hash, mempool, etc.) to be allocated from external heaps. > Currently, we can kinda sorta do that with various _init() API's > (initializing a data structure over already allocated memzone), but this is > not ideal and is a hassle for anyone using external memory in DPDK. > > There are basically four ways to approach this problem (that i can see). > > First way is to change "socket ID" to mean "heap ID" everywhere. This has an > upside of having a consistent API to allocate from internal and external > heaps, with little to no API additions, only internal changes to account for > the fact that "socket ID" is now "heap ID". > > However, there is a massive downside to this approach: it is a *giant* API > change, and it's also a giant *ABI-compatible* API change. Meaning, replacing > socket ID with heap ID will not cause compile failures for old code, which > would result in many subtle bugs in already existing codebases. So, while in > the perfect world this would've been my preferred approach, realistically i > think this is a very, very bad idea. > > Second one is to add a separate "heap name" API's to everything. This has an > upside of clean separation between allocation from internal and external > heaps. (well, whether it's an upside is debatable...) This is the approach i > expected to take when i was creating this patchset. > > The downside is that we have to add new API's to every library and every DPDK > data structure, to allow explicit allocation from external heaps. We will > have to maintain both, and things like hardware drivers will need to have a > way to indicate the need to allocate things from a particular external heap. > > The third way is to expose the "heap ID" externally, and allow a single, > unified API to reserve memory. That is, create an API that would map either a > NUMA node ID or a heap name to an ID, and allow reserving memory through that > ID regardless of whether it's internal or external memory. This would also > allow to gradually phase out socket-based ID's in favor of heap ID API, > should we choose to do so. > > The downside for this is, it adds a layer of indirection between socket ID > and reserving memory on a particular NUMA node, and it makes it hard to > produce a single value of "heap ID" in such a way as to replicate current > functionality of allocating with SOCKET_ID_ANY. Most likely user will have to > explicitly try to allocate on all sockets, unless we keep old API's around in > parallel. > > Finally, a fourth way would be to abuse the socket ID to also mean something > else, which is an approach i've seen numerous times already, and one that i > don't like. We could register new heaps as a new, fake socket ID, and use > that to address external heaps (each heap would get its own socket). So, keep > current socket ID behavior, but for non-existent sockets it would be possible > to be registered as a fake socket pointing to an external heap. > > The upside for this approach would be that no API changes are required > whatsoever to existing libraries - this scheme is compatible with both > internal and external heaps without adding a separate API. > > The downside is bad semantics - "special" sockets, handling of SOCKET_ID_ANY, > handling of "invalid socket" vs. "invalid socket that happens to correspond > to an existing external heap", and many other things that can be confusing. I > don't like this option, but it's an option :) > > Thoughts? Comments?
#1 is super clean, but very disruptive to everyone. Very Bad IMO #2 is also clean, but adds a lot of new APIs that everyone needs to use or at least in the external heap cases. #3 not sure I fully understand it, but reproducing heap IDs for testing is a problem and requires new/old APIs #4 Very easy to add, IMO it is clean and very small disruption to developers. It does require the special handling, but I feel it is OK and can be explained in the docs. Having a socket id as an ‘int’ gives us a lot room e.g. id < 64K is normal socket and > 64K is external id. My vote would be #4, as it seems the least risk and work. :-) > > I myself still favor the second way, however there are good arguments to be > made for each of these options. > > -- > Thanks, > Anatoly Regards, Keith