On 12-Mar-19 1:47 AM, Lilijun (Jerry, Cloud Networking) wrote:
Hi Anatoly,
-----Original Message-----
From: Burakov, Anatoly [mailto:anatoly.bura...@intel.com]
Sent: Friday, March 08, 2019 5:38 PM
To: Lilijun (Jerry, Cloud Networking) <jerry.lili...@huawei.com>;
dev@dpdk.org
Cc: jerry.zh...@intel.com; ian.sto...@intel.com
Subject: Re: [dpdk-dev] [PATCH] eal: unmap unneed dpdk VA spaces for
legacy mem
On 08-Mar-19 5:38 AM, Lilijun wrote:
Comparing dpdk VA spaces to dpdk 16.11, the dpdk app process's VA
spaces increase to above 30G.
Here we can unmap the unneed VA spaces in rte_memseg_list.
Signed-off-by: Lilijun <jerry.lili...@huawei.com>
---
lib/librte_eal/linuxapp/eal/eal_memory.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 32feb41..56abdd2 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1626,8 +1626,19 @@ void numa_error(char *where)
if (msl->base_va == NULL)
continue;
/* skip lists where there is at least one page allocated */
- if (msl->memseg_arr.count > 0)
+ if (msl->memseg_arr.count > 0) {
+ if (internal_config.legacy_mem) {
+ struct rte_fbarray *arr = &msl->memseg_arr;
+ int idx = rte_fbarray_find_next_free(arr, 0);
+
+ while (idx >= 0) {
+ void *va = (void*)((char*)msl-
base_va + idx * msl->page_sz);
+ munmap(va, msl->page_sz);
+ idx = rte_fbarray_find_next_free(arr,
idx + 1);
+ }
I am not entirely convinced this change is safe to do. Technically, this space
is
marked as free, so correctly written code should not attempt to access it,
however it is still potentially dangerous to have memory area that is
supposed to be allocated (according to data structures'
parameters), but isn't.
If you are deallocating the VA space, ideally you should also resize the
memseg list (as in, change its length), because that leftover memory area is
no longer valid. However, this then presents us with a mismatch between
(va_start + len) and (va_start + page_sz * memseg_arr.len), which may
break things further.
Yes, you're right, here we need resize the memseg length. I will update it if
this patch is needed.
Resizing memseg list is not the best course of action because fbarray
itself doesn't support resizing, so you'll end up with a mismatch
between length of memory and length of fbarray backing the memseg list.
See below suggestion for implementation.
May i ask what is the purpose of this change? I mean, i understand the part
about unused VA space sitting there, but what is the consequence of that?
This isn't 32-bit codepath, and in 64-bit there's plenty of address space to go
around, and this memory doesn't take up any system resources anyway
because it is read-only anonymous memory, and is therefore backed by zero
page instead of real pages. So, what's wrong with just leaving it there?
This change will cause a issues: when dpdk apps crashed, the coredump file
will become too large.
Thanks.
You must have different default coredump settings than i do, because i
haven't seen Linux attempting to dump the entire address space before (i
have seen FreeBSD do that, mind you...).
I don't see any advantage of this change, and i see plenty of disadvantages,
so for now i'm inclined to NACK this particular patch.
_However_, i should note that if you feel this is very important feature to
have and would still like to implement it, my advise would be to look at how
32-bit code works, and model the 64-bit implementation after that, because
32-bit codepath does exactly what you propose, and doesn't leave unused
address space.
The above is the way to go as far as implementing this particular
feature goes: this has to be done at memseg list allocation time, not
post-factum, when memseg lists are already allocated.
+ } > continue;
+ }
/* this is an unused list, deallocate it */
mem_sz = msl->len;
munmap(msl->base_va, mem_sz);
--
Thanks,
Anatoly
--
Thanks,
Anatoly