Hi,

On 8 January 2018 at 10:34, Vincent Guittot <vincent.guit...@linaro.org> wrote:
> Hi Xiaolong,
>
> On 25 December 2017 at 07:07, kernel test robot <xiaolong...@intel.com> wrote:
>>
>> Greeting,
>>
>> FYI, we noticed a -4.3% regression of unixbench.score due to commit:
>>
>> commit: a4c3c04974d648ee6e1a09ef4131eb32a02ab494 ("sched/fair: Update and
>> fix the runnable propagation rule")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>> in testcase: unixbench
>> on test machine: 8 threads Ivy Bridge with 16G memory
>> with the following parameters:
>>
>>         runtime: 300s
>>         nr_task: 100%
>>         test: shell1
>>         cpufreq_governor: performance
>>
>> test-description: UnixBench is the original BYTE UNIX benchmark suite, which
>> aims to test the performance of Unix-like systems.
>> test-url: https://github.com/kdlucas/byte-unixbench
>>
>
> I don't have the machine described above, so I have tried to reproduce
> the problem on my 8-core Cortex-A53 platform, but I don't see any
> performance regression.
> I have also tried with a VM on an Intel(R) Core(TM) i7-4810MQ and
> haven't seen a regression there either.
>
> Have you seen the regression on any other platform?
I have also been able to run the test on a 12-core Intel(R) Xeon(R) CPU
E5-2630 and haven't seen any regression there either.
I changed the command to "./Run Shell1 -c 12 -i 30" instead of
"./Run Shell1 -c 8 -i 30", as there were more cores.

Regards,
Vincent

>
> Regards,
> Vincent
>
>>
>> Details are as below:
>> -------------------------------------------------------------------------------------------------->
>>
>> To reproduce:
>>
>>         git clone https://github.com/intel/lkp-tests.git
>>         cd lkp-tests
>>         bin/lkp install job.yaml  # job file is attached in this email
>>         bin/lkp run     job.yaml
>>
>> =========================================================================================
>> compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
>>   gcc-7/performance/x86_64-rhel-7.2/100%/debian-x86_64-2016-08-31.cgz/300s/lkp-ivb-d01/shell1/unixbench
>>
>> commit:
>>   c6b9d9a330 ("sched/wait: Fix add_wait_queue() behavioral change")
>>   a4c3c04974 ("sched/fair: Update and fix the runnable propagation rule")
>>
>> c6b9d9a330290144 a4c3c04974d648ee6e1a09ef41
>> ---------------- --------------------------
>>          %stddev     %change         %stddev
>>              \          |                \
>>      13264            -4.3%      12694        unixbench.score
>>   10619292           -11.7%    9374917        unixbench.time.involuntary_context_switches
>>  4.829e+08            -4.3%   4.62e+08        unixbench.time.minor_page_faults
>>       1126            -3.6%       1086        unixbench.time.system_time
>>       2645            -3.0%       2566        unixbench.time.user_time
>>   15855720            -6.2%   14878247        unixbench.time.voluntary_context_switches
>>       0.00 ± 56%      -0.0        0.00 ± 57%  mpstat.cpu.iowait%
>>      79517            -5.7%      74990        vmstat.system.cs
>>      16361            -3.3%      15822        vmstat.system.in
>>  1.814e+08           -24.0%  1.379e+08        cpuidle.C1.time
>>    3436399           -20.6%    2728227        cpuidle.C1.usage
>>    7772815            -9.9%    7001076        cpuidle.C1E.usage
>>  1.479e+08           +66.1%  2.456e+08        cpuidle.C3.time
>>    1437889           +38.7%    1994073        cpuidle.C3.usage
>>      18147           +13.9%      20676        cpuidle.POLL.usage
>>    3436173           -20.6%    2727580        turbostat.C1
>>       3.54            -0.8        2.73        turbostat.C1%
>>    7772758            -9.9%    7001012        turbostat.C1E
>>    1437858           +38.7%    1994034        turbostat.C3
>>       2.88            +2.0        4.86        turbostat.C3%
>>      18.50           +10.8%      20.50        turbostat.CPU%c1
>>       0.54 ±  2%    +179.6%       1.51        turbostat.CPU%c3
>>   2.32e+12            -4.3%   2.22e+12        perf-stat.branch-instructions
>>  6.126e+10            -4.9%  5.823e+10        perf-stat.branch-misses
>>       8.64 ±  4%      +0.6        9.25        perf-stat.cache-miss-rate%
>>  1.662e+11            -4.3%   1.59e+11        perf-stat.cache-references
>>   51040611            -7.0%   47473754        perf-stat.context-switches
>>  1.416e+13            -3.6%  1.365e+13        perf-stat.cpu-cycles
>>    8396968            -3.9%    8065835        perf-stat.cpu-migrations
>>  2.919e+12            -4.3%  2.793e+12        perf-stat.dTLB-loads
>>   1.89e+12            -4.3%  1.809e+12        perf-stat.dTLB-stores
>>      67.97            +1.1       69.03        perf-stat.iTLB-load-miss-rate%
>>  4.767e+09            -1.3%  4.704e+09        perf-stat.iTLB-load-misses
>>  2.247e+09            -6.0%  2.111e+09        perf-stat.iTLB-loads
>>   1.14e+13            -4.3%  1.091e+13        perf-stat.instructions
>>       2391            -3.0%       2319        perf-stat.instructions-per-iTLB-miss
>>  4.726e+08            -4.3%  4.523e+08        perf-stat.minor-faults
>>  4.726e+08            -4.3%  4.523e+08        perf-stat.page-faults
>>     585.14 ±  4%     -55.0%     263.59 ± 12%  sched_debug.cfs_rq:/.load_avg.avg
>>       1470 ±  4%     -42.2%     850.09 ± 24%  sched_debug.cfs_rq:/.load_avg.max
>>     154.17 ± 22%     -49.2%      78.39 ±  7%  sched_debug.cfs_rq:/.load_avg.min
>>     438.33 ±  6%     -41.9%     254.49 ± 27%  sched_debug.cfs_rq:/.load_avg.stddev
>>       2540 ± 15%     +23.5%       3137 ± 11%  sched_debug.cfs_rq:/.removed.runnable_sum.avg
>>     181.83 ± 11%     -56.3%      79.50 ± 34%  sched_debug.cfs_rq:/.runnable_load_avg.avg
>>      16.46 ± 37%     -72.9%       4.45 ±110%  sched_debug.cfs_rq:/.runnable_load_avg.min
>>     294.77 ±  5%     +11.2%     327.87 ±  6%  sched_debug.cfs_rq:/.util_avg.stddev
>>     220260 ±  8%     +20.3%     264870 ±  4%  sched_debug.cpu.avg_idle.avg
>>     502903 ±  4%     +21.0%     608663        sched_debug.cpu.avg_idle.max
>>     148667 ±  6%     +29.5%     192468 ±  2%  sched_debug.cpu.avg_idle.stddev
>>     180.64 ± 10%     -53.4%      84.23 ± 34%  sched_debug.cpu.cpu_load[0].avg
>>      25.73 ± 15%     -85.6%       3.70 ±113%  sched_debug.cpu.cpu_load[0].min
>>     176.98 ±  6%     -52.5%      84.06 ± 35%  sched_debug.cpu.cpu_load[1].avg
>>      53.93 ± 13%     -72.6%      14.75 ± 15%  sched_debug.cpu.cpu_load[1].min
>>     176.61 ±  4%     -55.3%      78.92 ± 31%  sched_debug.cpu.cpu_load[2].avg
>>      73.78 ± 11%     -73.4%      19.61 ±  7%  sched_debug.cpu.cpu_load[2].min
>>     177.42 ±  3%     -58.8%      73.09 ± 21%  sched_debug.cpu.cpu_load[3].avg
>>      93.01 ±  8%     -73.9%      24.25 ±  6%  sched_debug.cpu.cpu_load[3].min
>>     173.36 ±  3%     -60.6%      68.26 ± 13%  sched_debug.cpu.cpu_load[4].avg
>>     274.36 ±  5%     -48.6%     141.16 ± 44%  sched_debug.cpu.cpu_load[4].max
>>     107.87 ±  6%     -73.0%      29.11 ±  9%  sched_debug.cpu.cpu_load[4].min
>>      11203 ±  9%      +9.9%      12314 ±  6%  sched_debug.cpu.curr->pid.avg
>>    1042556 ±  3%      -6.9%     970165 ±  2%  sched_debug.cpu.sched_goidle.max
>>     748905 ±  5%     -13.4%     648459        sched_debug.cpu.sched_goidle.min
>>      90872 ± 11%     +17.4%     106717 ±  5%  sched_debug.cpu.sched_goidle.stddev
>>     457847 ±  4%     -15.0%     389113        sched_debug.cpu.ttwu_local.min
>>      18.60            -1.1       17.45        perf-profile.calltrace.cycles-pp.secondary_startup_64
>>      16.33 ±  2%      -1.0       15.29        perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
>>      16.33 ±  2%      -1.0       15.29        perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
>>      16.32 ±  2%      -1.0       15.29        perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>>      15.44 ±  2%      -1.0       14.43        perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
>>      15.69 ±  2%      -1.0       14.71        perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>>       5.54            -0.1        5.45        perf-profile.calltrace.cycles-pp.__libc_fork
>>      10.28            +0.0       10.32        perf-profile.calltrace.cycles-pp.page_fault
>>      10.16            +0.0       10.21        perf-profile.calltrace.cycles-pp.do_page_fault.page_fault
>>      10.15            +0.1       10.20        perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault
>>       9.47            +0.1        9.56        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>>      11.49            +0.1       11.59        perf-profile.calltrace.cycles-pp.sys_execve.do_syscall_64.return_from_SYSCALL_64.execve
>>       8.28            +0.1        8.38        perf-profile.calltrace.cycles-pp.load_elf_binary.search_binary_handler.do_execveat_common.sys_execve.do_syscall_64
>>      11.49            +0.1       11.59        perf-profile.calltrace.cycles-pp.return_from_SYSCALL_64.execve
>>      11.49            +0.1       11.59        perf-profile.calltrace.cycles-pp.do_syscall_64.return_from_SYSCALL_64.execve
>>       8.30            +0.1        8.41        perf-profile.calltrace.cycles-pp.search_binary_handler.do_execveat_common.sys_execve.do_syscall_64.return_from_SYSCALL_64
>>      11.46            +0.1       11.58        perf-profile.calltrace.cycles-pp.do_execveat_common.sys_execve.do_syscall_64.return_from_SYSCALL_64.execve
>>       8.46            +0.1        8.57        perf-profile.calltrace.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>>       5.21            +0.1        5.34 ±  2%  perf-profile.calltrace.cycles-pp.exit_mmap.mmput.do_exit.do_group_exit.__wake_up_parent
>>       5.24            +0.1        5.38 ±  2%  perf-profile.calltrace.cycles-pp.mmput.do_exit.do_group_exit.__wake_up_parent.entry_SYSCALL_64_fastpath
>>      13.20            +0.1       13.34        perf-profile.calltrace.cycles-pp.execve
>>       6.79            +0.2        6.94 ±  2%  perf-profile.calltrace.cycles-pp.__wake_up_parent.entry_SYSCALL_64_fastpath
>>       6.79            +0.2        6.95 ±  2%  perf-profile.calltrace.cycles-pp.do_group_exit.__wake_up_parent.entry_SYSCALL_64_fastpath
>>       6.78            +0.2        6.94        perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__wake_up_parent.entry_SYSCALL_64_fastpath
>>       5.98            +0.2        6.18        perf-profile.calltrace.cycles-pp.vfprintf.__vsnprintf_chk
>>       8.38            +0.2        8.61        perf-profile.calltrace.cycles-pp.__vsnprintf_chk
>>      14.17            +0.3       14.49        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_fastpath
>>      18.60            -1.1       17.45        perf-profile.children.cycles-pp.do_idle
>>      18.60            -1.1       17.45        perf-profile.children.cycles-pp.cpu_startup_entry
>>      18.60            -1.1       17.45        perf-profile.children.cycles-pp.secondary_startup_64
>>      17.60            -1.1       16.46        perf-profile.children.cycles-pp.intel_idle
>>      17.89            -1.1       16.80        perf-profile.children.cycles-pp.cpuidle_enter_state
>>      16.33 ±  2%      -1.0       15.29        perf-profile.children.cycles-pp.start_secondary
>>       5.54            -0.1        5.45        perf-profile.children.cycles-pp.__libc_fork
>>      16.15            +0.0       16.18        perf-profile.children.cycles-pp.do_page_fault
>>      16.19            +0.0       16.22        perf-profile.children.cycles-pp.page_fault
>>       6.24            +0.1        6.29 ±  2%  perf-profile.children.cycles-pp.filemap_map_pages
>>      16.07            +0.1       16.13        perf-profile.children.cycles-pp.__do_page_fault
>>      16.85            +0.1       16.92        perf-profile.children.cycles-pp.do_syscall_64
>>      16.85            +0.1       16.92        perf-profile.children.cycles-pp.return_from_SYSCALL_64
>>       9.22            +0.1        9.33        perf-profile.children.cycles-pp.search_binary_handler
>>      13.49            +0.1       13.61        perf-profile.children.cycles-pp.__handle_mm_fault
>>       4.89            +0.1        5.02 ±  2%  perf-profile.children.cycles-pp.unmap_page_range
>>       9.11            +0.1        9.24        perf-profile.children.cycles-pp.load_elf_binary
>>      13.20            +0.1       13.34        perf-profile.children.cycles-pp.execve
>>      12.82            +0.1       12.96        perf-profile.children.cycles-pp.sys_execve
>>       4.95            +0.2        5.10 ±  2%  perf-profile.children.cycles-pp.unmap_vmas
>>      12.79            +0.2       12.95        perf-profile.children.cycles-pp.do_execveat_common
>>      13.90            +0.2       14.07        perf-profile.children.cycles-pp.handle_mm_fault
>>       6.95            +0.2        7.13 ±  2%  perf-profile.children.cycles-pp.do_exit
>>       6.95            +0.2        7.13 ±  2%  perf-profile.children.cycles-pp.do_group_exit
>>       6.95            +0.2        7.13 ±  2%  perf-profile.children.cycles-pp.__wake_up_parent
>>       6.40 ±  2%      +0.2        6.62        perf-profile.children.cycles-pp.vfprintf
>>       8.38            +0.2        8.61        perf-profile.children.cycles-pp.__vsnprintf_chk
>>       9.21            +0.2        9.46        perf-profile.children.cycles-pp.mmput
>>       9.16            +0.2        9.41        perf-profile.children.cycles-pp.exit_mmap
>>      19.85            +0.3       20.13        perf-profile.children.cycles-pp.entry_SYSCALL_64_fastpath
>>      17.60            -1.1       16.46        perf-profile.self.cycles-pp.intel_idle
>>       6.03 ±  2%      +0.2        6.26        perf-profile.self.cycles-pp.vfprintf
>>
>>
>>                                 unixbench.score
>>
>>   14000 +-+-----------------------------------------------------------------+
>>         O.O..O.O.O.O..O.O.O.O..O.O.O.O..O.O.O..O.O.O.+   +.+.+..+.+.+.+..+.|
>>   12000 +-+                                          :   :                  |
>>         |                                            :   :                  |
>>   10000 +-+                                          :   :                  |
>>         |                                            :   :                  |
>>    8000 +-+                                          :   :                  |
>>         |                                            :   :                  |
>>    6000 +-+                                          :   :                  |
>>         |                                            :   :                  |
>>    4000 +-+                                          :   :                  |
>>         |                                            ::                     |
>>    2000 +-+                                           :                     |
>>         |                                             :                     |
>>       0 +-+-----------------------------------------------------------------+
>>
>>
>> Disclaimer:
>> Results have been estimated based on internal Intel analysis and are provided
>> for informational purposes only. Any difference in system hardware or software
>> design or configuration may affect actual performance.
>>
>> Thanks,
>> Xiaolong
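
Vincent's adjustment above — matching the `-c` copy count to the number of cores — can be scripted so the same UnixBench invocation adapts to whichever machine runs it. A minimal sketch, assuming a POSIX shell with coreutils' `nproc` available (the `Run Shell1 -c N -i 30` flags are the ones used in this thread; the command is meant to be executed from a byte-unixbench/UnixBench checkout):

```shell
# Build the UnixBench shell1 invocation with one copy per online CPU,
# mirroring the "-c 12 on 12 cores" adjustment from this thread.
NCPU=$(nproc)                      # number of online CPUs
CMD="./Run Shell1 -c $NCPU -i 30"  # 30 iterations, as in the thread
# Print the command; run it from the byte-unixbench/UnixBench directory.
echo "$CMD"
```

On the 8-thread Ivy Bridge test box this would reproduce the robot's `-c 8` run, and on the 12-core Xeon it yields Vincent's `-c 12` variant.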