2024-12-26 16:10 (UTC+0800), Yang Ming:
> Fix the issue where OS memory is mistakenly freed with rte_free
> by setting the length (len) of unused memseg to 0.
> 
> When eal_legacy_hugepage_init releases the VA space for unused
> memseg lists, it does not reset their length to 0. As a result,
> mlx5_mem_is_rte may incorrectly identify OS memory as DPDK
> memory. This can lead to mlx5_free calling rte_free on OS memory,
> causing an "EAL: Error: Invalid memory" log and failing to free
> the OS memory.
> 
> This issue is occasional and occurs when the DPDK program’s
> memory map places the heap address range between 0 and len(32G).
> In such cases, malloc may return an address less than len,
> causing mlx5_mem_is_rte to incorrectly treat it as DPDK memory.
> 
> Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
> Cc: anatoly.bura...@intel.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Yang Ming <ming.1.y...@nokia-sbell.com>
> ---
>  lib/eal/linux/eal_memory.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
> index 45879ca743..9dda60c0e1 100644
> --- a/lib/eal/linux/eal_memory.c
> +++ b/lib/eal/linux/eal_memory.c
> @@ -1472,6 +1472,7 @@ eal_legacy_hugepage_init(void)
>               mem_sz = msl->len;
>               munmap(msl->base_va, mem_sz);
>               msl->base_va = NULL;
> +             msl->len = 0;
>               msl->heap = 0;
>  
>               /* destroy backing fbarray */

Hi Yang,

It seems the bug affects more than just the mlx5 PMD.

Consider how an MSL with `base_va == NULL` ends up in `mlx5_mem_is_rte()`.
It comes from `rte_mem_virt2memseg_list()`, which iterates over MSLs
and checks whether an address belongs to [`base_va`; `base_va+len`)
without checking whether `base_va == NULL`, i.e. whether the MSL is inactive.
Your patch also corrects the behavior of `rte_mem_virt2memseg_list()`.
Please mention this in the commit message.
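
To illustrate the range check described above, here is a minimal standalone
sketch. It is not the DPDK code: `struct msl_sketch` and
`msl_sketch_contains()` are hypothetical names that only model the
`base_va` and `len` fields of an MSL and a check of
[`base_va`; `base_va+len`) with no test for `base_va == NULL`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two MSL fields that matter here. */
struct msl_sketch {
	void *base_va;   /* start of the reserved VA range, NULL once unmapped */
	uint64_t len;    /* length of the reserved VA range in bytes */
};

/*
 * Range check in the style described above: the address is tested against
 * [base_va; base_va+len) and base_va == NULL is not treated specially.
 */
static bool
msl_sketch_contains(const struct msl_sketch *msl, const void *addr)
{
	uint64_t start = (uint64_t)(uintptr_t)msl->base_va;
	uint64_t end = start + msl->len;
	uint64_t a = (uint64_t)(uintptr_t)addr;

	return a >= start && a < end;
}
```

Under these assumptions, an unused MSL left with `base_va == NULL` and a
stale `len` of 32G covers [0; 32G), so a malloc'ed pointer such as
0x10000000 passes the check. With `len` reset to 0 the range is empty and
no address can match.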

Acked-by: Dmitry Kozlyuk <dmitry.kozl...@gmail.com>
