2024-12-26 16:10 (UTC+0800), Yang Ming:
> Fix the issue where OS memory is mistakenly freed with rte_free
> by setting the length (len) of unused memseg to 0.
>
> When eal_legacy_hugepage_init releases the VA space for unused
> memseg lists, it does not reset their length to 0. As a result,
> mlx5_mem_is_rte may incorrectly identify OS memory as DPDK
> memory. This can lead to mlx_free calling rte_free on OS memory,
> causing an "EAL: Error: Invalid memory" log and failing to free
> the OS memory.
>
> This issue is occasional and occurs when the DPDK program’s
> memory map places the heap address range between 0 and len(32G).
> In such cases, malloc may return an address less than len,
> causing mlx5_mem_is_rte to incorrectly treat it as DPDK memory.
>
> Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
> Cc: anatoly.bura...@intel.com
> Cc: sta...@dpdk.org
>
> Signed-off-by: Yang Ming <ming.1.y...@nokia-sbell.com>
> ---
>  lib/eal/linux/eal_memory.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
> index 45879ca743..9dda60c0e1 100644
> --- a/lib/eal/linux/eal_memory.c
> +++ b/lib/eal/linux/eal_memory.c
> @@ -1472,6 +1472,7 @@ eal_legacy_hugepage_init(void)
>  			mem_sz = msl->len;
>  			munmap(msl->base_va, mem_sz);
>  			msl->base_va = NULL;
> +			msl->len = 0;
>  			msl->heap = 0;
>
>  			/* destroy backing fbarray */

Hi Yang,

It seems the bug affects more than just the mlx5 PMD.
Consider how the MSL with `base_va == NULL` ends up in `mlx5_mem_is_rte()`.
It comes from `rte_mem_virt2memseg_list()`, which iterates over the MSLs
and checks whether an address belongs to [`base_va`; `base_va+len`)
without checking whether `base_va == NULL`, i.e. whether the MSL is inactive.
Your patch also corrects the behavior of `rte_mem_virt2memseg_list()`.
Please mention this in the commit message.

Acked-by: Dmitry Kozlyuk <dmitry.kozl...@gmail.com>
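
To illustrate the range check described above, here is a minimal sketch. It is a simplified model, not the actual DPDK code: `struct sketch_msl`, `sketch_virt2msl()` and `sketch_self_check()` are hypothetical names, only the `base_va` and `len` fields are modeled, and a 1 GiB length is used instead of 32G so the example also builds on 32-bit targets. It assumes a NULL pointer converts to address 0, as it does on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a memseg list: only the two
 * fields that take part in the range check are modeled. */
struct sketch_msl {
	void *base_va;	/* start of the VA range, NULL once unmapped */
	size_t len;	/* length of the VA range in bytes */
};

/* Range-only lookup: returns the first list whose
 * [base_va; base_va+len) contains addr, with no separate
 * "is this list active?" check. */
static const struct sketch_msl *
sketch_virt2msl(const struct sketch_msl *lists, size_t n, const void *addr)
{
	uintptr_t a = (uintptr_t)addr;

	for (size_t i = 0; i < n; i++) {
		uintptr_t start = (uintptr_t)lists[i].base_va;
		uintptr_t end = start + lists[i].len;

		if (a >= start && a < end)
			return &lists[i];
	}
	return NULL;
}

/* Shows the effect of resetting len on a released list. */
static void
sketch_self_check(void)
{
	/* Pretend this low address came from malloc(), i.e. OS memory. */
	const void *os_addr = (const void *)(uintptr_t)0x10000000;

	/* Released list: base_va cleared, len left stale (1 GiB here). */
	struct sketch_msl released = { .base_va = NULL, .len = (size_t)1 << 30 };

	/* Stale len: the OS address falls into [0; len) and matches. */
	assert(sketch_virt2msl(&released, 1, os_addr) == &released);

	/* With len reset to 0 the range is empty and nothing matches. */
	released.len = 0;
	assert(sketch_virt2msl(&released, 1, os_addr) == NULL);
}
```

In this model, resetting `len` makes the range empty, so every caller of the lookup stops matching released lists, not only the mlx5 one.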