On 3/23/26 4:01 AM, Li Wang wrote:
On Fri, Mar 20, 2026 at 04:42:38PM -0400, Waiman Long wrote:
It was found that some of the tests in test_memcontrol fail more
readily if the system page size is larger than 4k, because the actual
memory.current value deviates further from the expected value with a
larger page size. This is likely because up to MEMCG_CHARGE_BATCH
pages of charge may be hidden in each of the per-CPU memcg_stock
structures.

To avoid these failures, the error tolerance is now increased in
accordance with the current system page size. The page size scale
factor is set to 2 for a 64k page size and 1 for 16k.

Changes are made in alloc_pagecache_max_30M(), test_memcg_protection()
and alloc_anon_50M_check_swap() to increase the error tolerance for
memory.current for larger page sizes. The current set of values is
chosen to ensure that the relevant test_memcontrol tests no longer
fail across 100 repeated runs of test_memcontrol with 4k/16k/64k
page size kernels on an arm64 system.

Signed-off-by: Waiman Long <[email protected]>
---
  .../cgroup/lib/include/cgroup_util.h          |  3 ++-
  .../selftests/cgroup/test_memcontrol.c        | 23 ++++++++++++++-----
  2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
index 77f386dab5e8..2293e770e9b4 100644
--- a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
+++ b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
@@ -6,7 +6,8 @@
  #define PAGE_SIZE 4096
  #endif
-#define MB(x) (x << 20)
+#define KB(x) ((x) << 10)
+#define MB(x) ((x) << 20)
#define USEC_PER_SEC 1000000L
  #define NSEC_PER_SEC  1000000000L
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index babbfad10aaf..c078fc458def 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -26,6 +26,7 @@
  static bool has_localevents;
  static bool has_recursiveprot;
  static int page_size;
+static int pscale_factor;      /* Page size scale factor */
int get_temp_fd(void)
  {
@@ -571,16 +572,17 @@ static int test_memcg_protection(const char *root, bool min)
        if (cg_run(parent[2], alloc_anon, (void *)MB(148)))
                goto cleanup;
- if (!values_close(cg_read_long(parent[1], "memory.current"), MB(50), 3))
+       if (!values_close(cg_read_long(parent[1], "memory.current"), MB(50),
+                                      3 + (min ? 0 : 4) * pscale_factor))
                goto cleanup;
for (i = 0; i < ARRAY_SIZE(children); i++)
                c[i] = cg_read_long(children[i], "memory.current");
- if (!values_close(c[0], MB(29), 15))
+       if (!values_close(c[0], MB(29), 15 + 3 * pscale_factor))
                goto cleanup;
- if (!values_close(c[1], MB(21), 20))
+       if (!values_close(c[1], MB(21), 20 + pscale_factor))
                goto cleanup;
if (c[3] != 0)
@@ -596,7 +598,8 @@ static int test_memcg_protection(const char *root, bool min)
        }
current = min ? MB(50) : MB(30);
-       if (!values_close(cg_read_long(parent[1], "memory.current"), current, 3))
+       if (!values_close(cg_read_long(parent[1], "memory.current"), current,
+                                      9 + (min ? 0 : 6) * pscale_factor))
                goto cleanup;
if (!reclaim_until(children[0], MB(10)))
@@ -684,7 +687,7 @@ static int alloc_pagecache_max_30M(const char *cgroup, void *arg)
                goto cleanup;
current = cg_read_long(cgroup, "memory.current");
-       if (!values_close(current, MB(30), 5))
+       if (!values_close(current, MB(30), 5 + (pscale_factor ? 2 : 0)))
                goto cleanup;
ret = 0;
@@ -1004,7 +1007,7 @@ static int alloc_anon_50M_check_swap(const char *cgroup, void *arg)
                *ptr = 0;
mem_current = cg_read_long(cgroup, "memory.current");
-       if (!mem_current || !values_close(mem_current, mem_max, 3))
+       if (!mem_current || !values_close(mem_current, mem_max, 6 + pscale_factor))
                goto cleanup;
swap_current = cg_read_long(cgroup, "memory.swap.current");
@@ -1684,6 +1687,14 @@ int main(int argc, char **argv)
        if (page_size <= 0)
                page_size = PAGE_SIZE;
+       /*
+        * It was found that the actual memory.current value can deviate
+        * more from the expected value with a larger page size. So the
+        * error tolerance has to be increased a bit more for larger
+        * page sizes.
+        */
+       if (page_size > KB(4))
+               pscale_factor = (page_size >= KB(64)) ? 2 : 1;
This is a good improvement, but I still think the pscale_factor
adjustments are a bit fragile: each call site needs its own hand-tuned
formula, and only three page sizes (4K/16K/64K) are handled. If a new
page size shows up, every call site needs revisiting.

How about centralizing the page size adjustment inside values_close()
itself? Something like:

     static inline int values_close(long a, long b, int err)
     {
             ssize_t page_adjusted_err = ffs(page_size >> 13) + err;

             return 100 * labs(a - b) <= (a + b) * page_adjusted_err;
     }

This adds one extra percent of tolerance per doubling above 4K, scales
continuously for any power-of-two page size, and also fixes an integer
truncation issue in the original: (a + b) / 100 * err loses precision
when (a + b) < 100.

With this, the callers wouldn't need any changes at all.

This method is inspired by LTP:
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/controllers/memcg/memcontrol_common.h#L27

Good point. I will implement something like this in the next version.

Cheers,
Longman
