On Sun, 28 May 2023 23:07:40 +0300 Baruch Even <bar...@weka.io> wrote:
> Hi,
>
> We found an issue with newer kernels (5.13+) that are found on newer OSes
> (Ubuntu22, Rocky9, Ubuntu20 with kernel 5.15) where a 2M page that was
> allocated for DPDK was migrated (moved into another physical page) when a
> 1G page was allocated.
>
> From our reading of the kernel commits this started with commit
> ae37c7ff79f1f030e28ec76c46ee032f8fd07607
> mm: make alloc_contig_range handle in-use hugetlb pages
>
> This caused what looked like memory corruptions to us and cases where the
> rings were moved from their physical location and communication was no
> longer possible.
>
> I wanted to ask if anyone else hit this issue and what mitigations are
> available?
>
> We are currently looking at using a kernel driver to pin the pages but I
> expect that this issue will affect others and that a more general approach
> is needed.
>
> Thanks,
> Baruch

Report this upstream as a kernel regression; they will probably care about it.
A kernel driver hack is overkill, and it becomes a maintenance burden and
long-term technical debt.