On 13-Dec-18 9:55 PM, Jim Harris wrote:
SPDK uses the rte_mem_event_callback_register API to
create RDMA memory regions (MRs) for newly allocated regions
of memory. This is used in both the SPDK NVMe-oF target
and the NVMe-oF host driver.
Hi Jim,
Here and in other places - it appears that you're adding double spaces
after end of sentence. We generally don't do that in DPDK.
DPDK creates internal malloc_elem structures for these
allocated regions. As users malloc and free memory, DPDK
will sometimes merge malloc_elems that originated from
different allocations that were notified through the
registered mem_event callback routine. This results
in subsequent allocations that can span across multiple
RDMA MRs. This requires SPDK to check each DPDK buffer to
see if it crosses an MR boundary, and if so, would have to
add considerable logic and complexity to describe that
buffer before it can be accessed by the RNIC. It is somewhat
analagous to rte_malloc returning a buffer that is not
IOVA-contiguous.
As a malloc_elem gets split and some of these elements
get freed, it can also result in DPDK sending an
RTE_MEM_EVENT_FREE notification for a subset of the
original RTE_MEM_EVENT_ALLOC notification. This is also
problematic for RDMA memory regions, since unregistering
the memory region is all-or-nothing. It is not possible
to unregister part of a memory region.
To support these types of applications, this patch adds
a new --match-allocations EAL init flag. When this
flag is specified, malloc elements from different
hugepage allocations will never be merged. Memory will
also only be freed back to the system (with the requisite
memory event callback) exactly as it was originally
allocated.
Since part of this patch is extending the size of struct
malloc_elem, we also fix up the malloc autotests so they
do not assume its size exactly fits in one cacheline.
Signed-off-by: Jim Harris <james.r.har...@intel.com>
---
doc/guides/prog_guide/env_abstraction_layer.rst | 13 +++++++++++++
lib/librte_eal/common/eal_common_options.c | 6 ++++++
lib/librte_eal/common/eal_internal_cfg.h | 1 +
lib/librte_eal/common/eal_options.h | 2 ++
lib/librte_eal/common/malloc_elem.c | 16 ++++++++++++----
lib/librte_eal/common/malloc_elem.h | 6 +++++-
lib/librte_eal/common/malloc_heap.c | 13 ++++++++++++-
lib/librte_eal/linuxapp/eal/eal.c | 5 +++++
test/test/test_malloc.c | 25 +++++++++++++++++++------
9 files changed, 75 insertions(+), 12 deletions(-)
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst
b/doc/guides/prog_guide/env_abstraction_layer.rst
index 8b5d050c7..c4032161f 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -169,6 +169,19 @@ not allow acquiring or releasing hugepages from the system
at runtime.
If neither ``-m`` nor ``--socket-mem`` were specified, the entire available
hugepage memory will be preallocated.
++ Hugepage allocation matching
+
+This behavior is enabled by specifying the ``--match-allocations`` command-line
+switch to the EAL. This switch is mutually exclusive with ``--legacy-mem``.
I wouldn't say it's "mutually exclusive" as much as it's simply
unsupported because there's no dynamic memory allocation in legacy mode.
(it's also unsupported with --no-huge as well). It is also a Linux-only
flag.
Please also update doc/guides/linux_gsg/linux_eal_parameters.rst to have
this parameter. It would also be a good idea to add a note to
doc/guides/rel_notes/release_19.02.rst.
+
+Some applications using memory event callbacks may require that hugepages be
+freed exactly as they were allocated. These applications may also require
+that any allocation from the malloc heap not span across allocations
+associated with two different memory event callbacks. Hugepage allocation
+matching can be used by these types of applications to satisfy both of these
+requirements. This can result in some increased memory usage which is
+very dependent on the memory allocation patterns of the application.
+
+ 32-bit support
Additional restrictions are present when running in 32-bit mode. In dynamic
diff --git a/lib/librte_eal/common/eal_common_options.c
b/lib/librte_eal/common/eal_common_options.c
index e31eca5c0..034447929 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -79,6 +79,7 @@ eal_long_options[] = {
{OPT_VMWARE_TSC_MAP, 0, NULL, OPT_VMWARE_TSC_MAP_NUM },
{OPT_LEGACY_MEM, 0, NULL, OPT_LEGACY_MEM_NUM },
{OPT_SINGLE_FILE_SEGMENTS, 0, NULL, OPT_SINGLE_FILE_SEGMENTS_NUM},
+ {OPT_MATCH_ALLOCATIONS, 0, NULL, OPT_MATCH_ALLOCATIONS_NUM},
{0, 0, NULL, 0 }
};
@@ -1424,6 +1425,11 @@ eal_check_common_options(struct internal_config *internal_cfg)
"with --"OPT_IN_MEMORY"\n");
return -1;
}
+ if (internal_cfg->legacy_mem && internal_cfg->match_allocations) {
+ RTE_LOG(ERR, EAL, "Option --"OPT_LEGACY_MEM" is not compatible "
+ "with --"OPT_MATCH_ALLOCATIONS"\n");
+ return -1;
+ }
Probably should add a check for --no-huge as well.
Everything else looks OK to me, so, provided the above comments are
addressed,
Reviewed-by: Anatoly Burakov <anatoly.bura...@intel.com>
--
Thanks,
Anatoly