Patch sent to ML:
https://lists.ubuntu.com/archives/kernel-team/2024-December/155919.html
SRU Justification:

[Impact]

Since upstream commit 003af997c8a9 ("hugetlb: force allocating surplus hugepages on mempolicy allowed nodes"), the LTP hugetlb test case 'hugemmap10' has been failing frequently. In some testing environments it fails almost every time.

The commit changed the surplus allocation behavior so that allocations are typically made on the lowest-numbered eligible NUMA node, even when allocating from the NUMA node on which the current task is running is feasible. Besides causing LTP test failures, this indicates a potential performance degradation.

This patch restores the previous behavior while preserving the benefits introduced by commit 003af997c8a9. It reduces LTP test failure noise and prevents the performance degradation.

[Fix]

The following upstream commit resolves the issue:
https://lore.kernel.org/all/20241204165503.628784-1-koichiro....@canonical.com

It does not currently meet the stable criteria, so it is unlikely to land in the upstream stable tree. That is why this SRU patch is needed.

[Test Plan]

Run the LTP hugemmap10 test and verify that it passes consistently on machines where it previously failed. If no such machine is available, compare the behavior with and without the patch as follows:

1. Prepare a machine that has two or more NUMA nodes, each with usable memory.
2. Run hugemmap10, simply pinned to a CPU on NUMA node #1:

   $ sudo taskset -c {some-cpu-on-node-1} ./hugemmap10

Without this patch the test fails consistently; with this patch the failures disappear.

[Where problems could occur]

This change affects only hugetlb, so any regression would most likely impact hugetlb users.
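
The {some-cpu-on-node-1} placeholder in step 2 of the test plan has to be filled in by hand. The following is a minimal sketch of one way to pick a value, assuming the node's CPUs are listed in /sys/devices/system/node/node1/cpulist as a comma-separated list of ids and ranges (for example "8-15"). The helper names parse_cpulist, first_cpu_of_node and taskset_command are hypothetical and are not part of LTP or of the patch.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a cpulist string such as '0-3,8-11' into a sorted list of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty string means the node has no CPUs
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def first_cpu_of_node(node, sysfs_root="/sys/devices/system/node"):
    """Return the lowest CPU id of a NUMA node, or None if it has no CPUs."""
    cpulist = Path(sysfs_root, f"node{node}", "cpulist").read_text()
    cpus = parse_cpulist(cpulist)
    return cpus[0] if cpus else None


def taskset_command(cpu, test_binary="./hugemmap10"):
    """Build the command line from step 2 of the test plan for a given CPU."""
    return ["sudo", "taskset", "-c", str(cpu), test_binary]
```

On a machine with at least two NUMA nodes, taskset_command(first_cpu_of_node(1)) yields the argument list for the command in step 2, which can then be passed to subprocess.run or printed and run by hand.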
https://bugs.launchpad.net/bugs/2089768

Title:
  hugemmap10 from ubuntu_ltp_stable.hugetlb failed on Noble

Status in ubuntu-kernel-tests:
  New

Bug description:
  Test hugemmap10 of ubuntu_ltp_stable.hugetlb was observed to fail:

  * N-lowlatency-6.8.0-50.51.1 (drapion, amd64)
  * N-nvidia-lowlatency-6.8.0-1019.21.1 (bunsen, amd64)
  * J-linux-lowlatency-hwe-6.8-6.8.0-50.51.1~22.04.1 (recht, arm64)

  ----
  07:35:55 DEBUG| [stdout] startup='Tue Nov 26 07:35:38 2024'
  07:35:55 DEBUG| [stdout] tst_hugepage.c:84: TINFO: 3 hugepage(s) reserved
  07:35:55 DEBUG| [stdout] tst_tmpdir.c:316: TINFO: Using /tmp/ltp-8CAxsmEARh/LTP_hugLIyG9r as tmpdir (ext2/ext3/ext4 filesystem)
  07:35:55 DEBUG| [stdout] tst_test.c:1085: TINFO: Mounting none to /tmp/ltp-8CAxsmEARh/LTP_hugLIyG9r/hugetlbfs fstyp=hugetlbfs flags=0
  07:35:55 DEBUG| [stdout] tst_test.c:1860: TINFO: LTP version: 20230929-874-gba610da01
  07:35:55 DEBUG| [stdout] tst_test.c:1864: TINFO: Tested kernel: 6.8.0-50-lowlatency #51.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 21 12:44:17 UTC 2024 x86_64
  07:35:55 DEBUG| [stdout] tst_test.c:1703: TINFO: Timeout per run is 0h 00m 30s
  07:35:55 DEBUG| [stdout] hugemmap10.c:388: TINFO: Base pool size: 0
  07:35:55 DEBUG| [stdout] hugemmap10.c:315: TINFO: Clean...
  07:35:55 DEBUG| [stdout] hugemmap10.c:366: TINFO: OK
  07:35:55 DEBUG| [stdout] hugemmap10.c:315: TINFO: Untouched, shared...
  07:35:55 DEBUG| [stdout] hugemmap10.c:366: TINFO: OK
  07:35:55 DEBUG| [stdout] hugemmap10.c:315: TINFO: Untouched, private...
  07:35:55 DEBUG| [stdout] hugemmap10.c:366: TINFO: OK
  07:35:55 DEBUG| [stdout] hugemmap10.c:315: TINFO: Touched, shared...
  07:35:55 DEBUG| [stdout] hugemmap10.c:366: TINFO: OK
  07:35:55 DEBUG| [stdout] hugemmap10.c:315: TINFO: Touched, private...
  07:35:55 DEBUG| [stdout] hugemmap10.c:366: TINFO: OK
  07:35:55 DEBUG| [stdout] hugemmap10.c:388: TINFO: Base pool size: 1
  07:35:55 DEBUG| [stdout] hugemmap10.c:315: TINFO: Clean...
  07:35:55 DEBUG| [stdout] hugemmap10.c:366: TINFO: OK
  07:35:55 DEBUG| [stdout] hugemmap10.c:315: TINFO: Untouched, shared...
  07:35:55 DEBUG| [stdout] hugemmap10.c:333: TFAIL: While doing munmap shared after touch: Bad HugePages_Total: expected 1, actual 2
  07:35:55 DEBUG| [stdout] hugemmap10.c:333: TFAIL: While doing munmap shared after touch: Bad HugePages_Free: expected 1, actual 2
  07:35:55 DEBUG| [stdout] hugemmap10.c:333: TFAIL: While doing munmap shared after touch: Bad HugePages_Surp: expected 0, actual 1
  07:35:55 DEBUG| [stdout]
  07:35:55 DEBUG| [stdout] Summary:
  07:35:55 DEBUG| [stdout] passed 0
  07:35:55 DEBUG| [stdout] failed 3
  07:35:55 DEBUG| [stdout] broken 0
  07:35:55 DEBUG| [stdout] skipped 0
  07:35:55 DEBUG| [stdout] warnings 0
  07:35:55 DEBUG| [stdout] tag=hugemmap10 stime=1732606538 dur=0 exit=exited stat=1 core=no cu=0 cs=4
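
The TFAIL lines above compare the HugePages_Total, HugePages_Free and HugePages_Surp counters with expected values. When reproducing the failure by hand, the same counters can be read from /proc/meminfo. The following is a rough sketch of such a check, not the logic hugemmap10 itself uses; the function names hugepage_counters and mismatches are hypothetical.

```python
import re


def hugepage_counters(meminfo_text):
    """Extract the HugePages_* counters from the text of /proc/meminfo."""
    counters = {}
    pattern = r"^(HugePages_\w+):\s+(\d+)$"
    for match in re.finditer(pattern, meminfo_text, re.MULTILINE):
        counters[match.group(1)] = int(match.group(2))
    return counters


def mismatches(counters, expected):
    """Map each counter that differs from its expected value to (expected, actual)."""
    return {
        name: (want, counters.get(name))
        for name, want in expected.items()
        if counters.get(name) != want
    }
```

For example, mismatches(hugepage_counters(open("/proc/meminfo").read()), {"HugePages_Surp": 0}) returns an empty dict when no surplus pages are left behind, and a non-empty one in a state like the one the log reports.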