On Sun, 28 May 2023 23:07:40 +0300
Baruch Even <bar...@weka.io> wrote:

> Hi,
> 
> We found an issue with newer kernels (5.13+) that are found on newer OSes
> (Ubuntu22, Rocky9, Ubuntu20 with kernel 5.15) where a 2M page that was
> allocated for DPDK was migrated (moved into another physical page) when a
> 1G page was allocated.
> 
> From our reading of the kernel commits this started with commit
> ae37c7ff79f1f030e28ec76c46ee032f8fd07607
>     mm: make alloc_contig_range handle in-use hugetlb pages
> 
> This caused what looked like memory corruptions to us and cases where the
> rings were moved from their physical location and communication was no
> longer possible.
> 
> I wanted to ask if anyone else hit this issue and what mitigations are
> available?
> 
> We are currently looking at using a kernel driver to pin the pages but I
> expect that this issue will affect others and that a more general approach
> is needed.
> 
> Thanks,
> Baruch
> 

The fix might be as simple as asking the kernel to lock the mapping at mmap() time.

diff --git a/lib/eal/linux/eal_hugepage_info.c b/lib/eal/linux/eal_hugepage_info.c
index 581d9dfc91eb..989c69387233 100644
--- a/lib/eal/linux/eal_hugepage_info.c
+++ b/lib/eal/linux/eal_hugepage_info.c
@@ -48,7 +48,8 @@ map_shared_memory(const char *filename, const size_t mem_size, int flags)
                return NULL;
        }
        retval = mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
-                       MAP_SHARED, fd, 0);
+                       MAP_SHARED_VALIDATE | MAP_LOCKED, fd, 0);
+
        close(fd);
        return retval == MAP_FAILED ? NULL : retval;
 }
