On Sun, Aug 10, 2014 at 06:54:13PM +0800, Fengguang Wu wrote:
> This view may be easier to read, by grouping the metrics by test case.
> 
> test case: brickland1/aim7/6000-page_test

OK, I have a similar system to the brickland thing (slightly different
configuration, but should be close enough).

Now; do you have a description of each test-case someplace? In
particular, it might be good to have a small annotation to show which
direction is better.

> 
>     128529 ± 1%     +17.9%     151594 ± 0%  TOTAL aim7.jobs-per-min

jobs per minute, + is better, so no worries there.

>     582269 ±14%     -55.6%     258617 ±16%  TOTAL softirqs.SCHED
>     993654 ± 2%     -19.9%     795962 ± 3%  TOTAL softirqs.RCU
>   15865125 ± 1%     -15.0%   13485882 ± 1%  TOTAL softirqs.TIMER

>   59366697 ± 3%     -46.1%   32017187 ± 7%  TOTAL cpuidle.C1-IVT.time
>      54543 ±11%     -37.2%      34252 ±16%  TOTAL cpuidle.C1-IVT.usage
>      19542 ± 9%     -38.3%      12057 ± 4%  TOTAL cpuidle.C1E-IVT.usage
>   49527464 ± 6%     -32.4%   33488833 ± 4%  TOTAL cpuidle.C1E-IVT.time
>      76064 ± 3%     -32.2%      51572 ± 6%  TOTAL cpuidle.C6-IVT.usage

Less idle time; that might be good if the work is CPU bound, and bad if
not; hard to say.

>       2.82 ± 3%     +21.9%       3.43 ± 4%  TOTAL turbostat.%pc2
>       4.40 ± 2%     +22.0%       5.37 ± 4%  TOTAL turbostat.%c6
>      15.75 ± 1%      -3.4%      15.21 ± 0%  TOTAL turbostat.RAM_W

>    3150464 ± 2%     -24.2%    2387551 ± 3%  TOTAL time.voluntary_context_switches

Typically fewer context switches are better.

>        281 ± 1%     -15.1%        238 ± 0%  TOTAL time.elapsed_time
>      29294 ± 1%     -14.3%      25093 ± 0%  TOTAL time.system_time

Less time spent (on presumably the same work) is better.

>    4529818 ± 1%      -8.8%    4129398 ± 1%  TOTAL time.involuntary_context_switches

Fewer preemptions, also generally better.

>      10655 ± 0%      +1.4%      10802 ± 0%  TOTAL time.percent_of_cpu_this_job_got

Seems an improvement; not sure.

Many more stats.. but from the above it looks like it's an overall 'win';
or am I reading the thing wrong?


Now I think I see why this is; we've reduced load balancing frequency
significantly on this machine due to:


-#define SD_SIBLING_INIT (struct sched_domain) {                        \
-       .min_interval           = 1,                                    \
-       .max_interval           = 2,                                    \


-#define SD_MC_INIT (struct sched_domain) {                             \
-       .min_interval           = 1,                                    \
-       .max_interval           = 4,                                    \


-#define SD_CPU_INIT (struct sched_domain) {                            \
-       .min_interval           = 1,                                    \
-       .max_interval           = 4,                                    \


        *sd = (struct sched_domain){
                .min_interval           = sd_weight,
                .max_interval           = 2*sd_weight,

This significantly increased both the min and max values for all domains
involved.

That said; I think we might want to do something like the below; I can
imagine decreasing load balancing too much will negatively impact other
workloads.

Maybe slightly modified to make sure the first domain has a min_interval
of 1.

---
 kernel/sched/core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1211575a2208..67ed5d854da1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6049,8 +6049,8 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
                sd_flags &= ~TOPOLOGY_SD_FLAGS;
 
        *sd = (struct sched_domain){
-               .min_interval           = sd_weight,
-               .max_interval           = 2*sd_weight,
+               .min_interval           = max(1, sd_weight/2),
+               .max_interval           = sd_weight,
                .busy_factor            = 32,
                .imbalance_pct          = 125,
 
@@ -6076,7 +6076,7 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
                                        ,
 
                .last_balance           = jiffies,
-               .balance_interval       = sd_weight,
+               .balance_interval       = max(1, sd_weight/2),
                .smt_gain               = 0,
                .max_newidle_lb_cost    = 0,
                .next_decay_max_lb_cost = jiffies,
