2021-03-19 09:51 (UTC-0700), Jie Zhou:
> Issue under active investigation:
> - Recent DPDK upstream change "eal: detach memsegs on cleanup" exposed
>   failure at eal exit with "EAL: Could not unmap memory: No Error".
>   Investigating KERNELBASE!UnmapViewOfFile. The issue will cause system
>   crash. Currently temporarily remove cleanup at exit on Windows.
>   Will revert after issue root caused and fixed

+Anatoly

It's my fault: I assumed the "eal: detach memsegs on cleanup" series was
related to multiprocess only and didn't properly review it.

The culprit is this code (eal_common_memory.c:1019):

for (i = 0; i < RTE_DIM(mcfg->memsegs); i++) {
        struct rte_memseg_list *msl = &mcfg->memsegs[i];

        /* skip uninitialized segments */
        if (msl->base_va == NULL)
                continue;
        /*
         * external segments are supposed to be detached at this point,
         * but if they aren't, we can't really do anything about it,
         * because if we skip them here, they'll become invalid after
         * we unmap the memconfig anyway. however, if this is externally
         * referenced memory, we have no business unmapping it.
         */
        if (!msl->external)
                if (rte_mem_unmap(msl->base_va, msl->len) != 0)
                        RTE_LOG(ERR, EAL, "Could not unmap memory: %s\n",
                                strerror(errno));

1. It assumes the memory was allocated by mapping, which is not the case on
Windows. It should call eal_mem_free() instead of rte_mem_unmap(); on Unices
that is the same munmap(). However...

2. It assumes this line will remove all mappings within (base_va, size), as
munmap()/rte_mem_unmap() would do. However, eal_mem_free(base_va, size) is
only guaranteed to work if (base_va, size) came from eal_mem_reserve(size) or
from OS-specific allocation (mmap on Unices, VirtualAlloc2 on Windows).
Because of underlying munmap, it works as desired on Unices, but not on
Windows.
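
To illustrate the Unix behavior that point 2 relies on, here is a minimal
POSIX-only sketch (not DPDK code; the function name is made up for this
example). It reserves a range with PROT_NONE, replaces two pages inside it
with separate MAP_FIXED mappings, and then releases everything with a single
munmap() over the whole range. On Windows there is no direct equivalent of
that last call: VirtualFree() with MEM_RELEASE expects the base address that
the allocation call returned.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Reserve four pages, turn pages 0 and 2 into separate usable mappings,
 * then release the whole range with one munmap() call.
 * Returns 0 on success, -1 on any failure.
 */
int
unmap_range_with_several_mappings(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	size_t page, len, i;
	char *base;

	if (page_size <= 0)
		return -1;
	page = (size_t)page_size;
	len = 4 * page;

	/* Reserve address space only; nothing is accessible yet. */
	base = mmap(NULL, len, PROT_NONE,
		MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return -1;

	/* Replace two pages of the reservation with real mappings. */
	for (i = 0; i < 4; i += 2) {
		void *p = mmap(base + i * page, page,
			PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
		if (p == MAP_FAILED) {
			munmap(base, len);
			return -1;
		}
		base[i * page] = 1; /* touch the page to prove it is usable */
	}

	/* One call removes every mapping that lies within the range. */
	return munmap(base, len) == 0 ? 0 : -1;
}
```

Calling unmap_range_with_several_mappings() from a main() on Linux or a BSD
should return 0, regardless of how many separate mappings sit inside the
range at that point.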

3. A minor point, but still: errno -> rte_errno, strerror -> rte_strerror.

I can make eal_mem_free() behave as expected in issue 2.
Or should we simply disable this code when there's no multiprocess support
(how do we test for it properly, BTW)?
