On Sun, 4 Oct 2020, kernel test robot wrote:

> Greeting,
>
> FYI, we noticed a -8.7% regression of vm-scalability.throughput due to commit:
>
> commit: 85b9f46e8ea451633ccd60a7d8cacbfff9f34047 ("mm, thp: track fallbacks due to failed memcg charges separately")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> in testcase: vm-scalability
> on test machine: 104 threads Skylake with 192G memory
> with following parameters:
>
> 	runtime: 300s
> 	size: 1T
> 	test: lru-shm
> 	cpufreq_governor: performance
> 	ucode: 0x2006906
>
> test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
> test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <rong.a.c...@intel.com>
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> To reproduce:
>
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp run job.yaml
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
>   gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/1T/lkp-skl-fpga01/lru-shm/vm-scalability/0x2006906
>
> commit:
>   dcdf11ee14 ("mm, shmem: add vmstat for hugepage fallback")
>   85b9f46e8e ("mm, thp: track fallbacks due to failed memcg charges separately")
>
> dcdf11ee14413332 85b9f46e8ea451633ccd60a7d8c
> ---------------- ---------------------------
>        fail:runs  %reproduction    fail:runs
>            |             |             |
>           1:4           24%           2:4     perf-profile.calltrace.cycles-pp.sync_regs.error_entry.do_access
>           3:4           53%           5:4     perf-profile.calltrace.cycles-pp.error_entry.do_access
>           9:4          -27%           8:4     perf-profile.children.cycles-pp.error_entry
>           4:4          -10%           4:4     perf-profile.self.cycles-pp.error_entry
>          %stddev     %change         %stddev
>              \          |                \
>     477291            -9.1%     434041        vm-scalability.median
>   49791027            -8.7%   45476799        vm-scalability.throughput
>     223.67            +1.6%     227.36        vm-scalability.time.elapsed_time
>     223.67            +1.6%     227.36        vm-scalability.time.elapsed_time.max
>      50364 ±  6%     +24.1%      62482 ± 10%  vm-scalability.time.involuntary_context_switches
>       2237            +7.8%       2412        vm-scalability.time.percent_of_cpu_this_job_got
>       3084           +18.2%       3646        vm-scalability.time.system_time
>       1921            -4.2%       1839        vm-scalability.time.user_time
>      13.68            +2.2       15.86        mpstat.cpu.all.sys%
>      28535 ± 30%     -47.0%      15114 ± 79%  numa-numastat.node0.other_node
>     142734 ± 11%     -19.4%     115000 ± 17%  numa-meminfo.node0.AnonPages
>      11168 ±  3%      +8.8%      12150 ±  5%  numa-meminfo.node1.PageTables
>      76.00            -1.6%      74.75        vmstat.cpu.id
>       3626            -1.9%       3555        vmstat.system.cs
>    2214928 ±166%     -96.6%      75321 ±  7%  cpuidle.C1.usage
>     200981 ±  7%     -18.0%     164861 ±  7%  cpuidle.POLL.time
>      52675 ±  3%     -16.7%      43866 ± 10%  cpuidle.POLL.usage
>      35659 ± 11%     -19.4%      28754 ± 17%  numa-vmstat.node0.nr_anon_pages
>    1248014 ±  3%     +10.9%    1384236        numa-vmstat.node1.nr_mapped
>       2722 ±  4%     +10.6%       3011 ±  5%  numa-vmstat.node1.nr_page_table_pages

I'm not sure that I'm reading this correctly, but I suspect that this just happens because of NUMA: memory affinity will obviously impact vm-scalability.throughput quite substantially, but I don't think the bisected commit can be to blame. Commit 85b9f46e8ea4 ("mm, thp: track fallbacks due to failed memcg charges separately") simply adds new count_vm_event() calls in a couple of areas to track thp fallback due to memcg limits separately from fallback due to fragmentation. This is more likely a question about the testing methodology in general: for memory-intensive benchmarks, I suggest configuring the test so that we can expect consistent memory access latency at the hardware level when running on a NUMA system.
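
For context, the change is essentially the pattern below (a minimal sketch of the THP fault path rather than the literal hunk; the helper and event names are only illustrative of what the commit describes):

	/*
	 * Sketch: when the memcg charge for the new huge page fails, count
	 * the fallback separately from fallback caused by fragmentation.
	 * THP_FAULT_FALLBACK_CHARGE stands in for the new vm_event_item;
	 * the pre-existing THP_FAULT_FALLBACK accounting is unchanged.
	 */
	if (mem_cgroup_try_charge(page, vma->vm_mm, gfp, &memcg, true)) {
		put_page(page);
		count_vm_event(THP_FAULT_FALLBACK);
		count_vm_event(THP_FAULT_FALLBACK_CHARGE);	/* new counter */
		return VM_FAULT_FALLBACK;
	}

Since these are only per-cpu counter increments on a path that is already failing the huge page allocation, I wouldn't expect them to be measurable as a throughput delta.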
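
As for what a consistent-latency configuration could look like in practice (this is an assumption about the harness, not something the report specifies), one option is to pin the workload's CPUs and memory to a single node, e.g. with numactl --cpunodebind=0 --membind=0, or programmatically along these lines:

	/*
	 * Hypothetical sketch: bind the benchmark to one NUMA node before the
	 * measurement loop so every run sees local-node allocation latency.
	 * Requires libnuma (link with -lnuma); node 0 is an arbitrary choice.
	 */
	#include <numa.h>
	#include <stdio.h>
	#include <stdlib.h>

	int main(void)
	{
		struct bitmask *nodes;

		if (numa_available() < 0) {
			fprintf(stderr, "NUMA is not supported on this system\n");
			return EXIT_FAILURE;
		}

		nodes = numa_parse_nodestring("0");
		numa_run_on_node(0);		/* run only on node 0's CPUs */
		numa_set_membind(nodes);	/* allocate only from node 0 */
		numa_bitmask_free(nodes);

		/* ... exec or run the memory-intensive workload here ... */
		return EXIT_SUCCESS;
	}

Whether binding to one node or interleaving across all of them is the right choice for this suite is up to the harness; the point is only that runs should see the same memory locality so the comparison is apples to apples.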