On Sun, Jan 27, 2008 at 09:01:15PM +0100, Guillaume Chazarain wrote:
> I noticed some strangely high wake up latencies with FAIR_USER_SCHED
> using this script:
<snip>

> We have two busy loops with UID=1.
> And UID=2 maintains the running median of its wake up latency.
> I get these latencies:
>
> # ./sched.py
> 4.300022 ms
> 4.801178 ms
> 4.604006 ms

Given that sysctl_sched_wakeup_granularity is set to 10ms by default,
this doesn't sound abnormal.

<snip>

> Disabling FAIR_USER_SCHED restores wake up latencies in the noise:
>
> # ./sched.py
> -0.156975 ms
> -0.067091 ms
> -0.022984 ms

The reason we get better wakeup latencies with !FAIR_USER_SCHED is this
snippet of code in place_entity():

	if (!initial) {
		/* sleeps upto a single latency don't count. */
		if (sched_feat(NEW_FAIR_SLEEPERS) && entity_is_task(se))
						     ^^^^^^^^^^^^^^^^^^
			vruntime -= sysctl_sched_latency;

		/* ensure we never gain time by being placed backwards. */
		vruntime = max_vruntime(se->vruntime, vruntime);
	}

The NEW_FAIR_SLEEPERS feature gives credit for sleeping only to tasks and
not to group-level entities. With the patch attached, I could see that
wakeup latencies with FAIR_USER_SCHED are restored to the same level as
!FAIR_USER_SCHED.

However, I am not sure whether that is the way to go. We want to let one
group of tasks keep running for as long as possible, until the
fairness/wakeup-latency threshold is exceeded. If someone does want better
wakeup latencies between groups too, they can always tune
sysctl_sched_wakeup_granularity.

<snip>

> Strangely enough, another way to restore normal latencies is to change
> setuid(2) to setuid(1), that is, putting the latency measurement in
> the same group as the two busy loops.

-- 
Regards,
vatsa
---
 kernel/sched_fair.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: current/kernel/sched_fair.c
===================================================================
--- current.orig/kernel/sched_fair.c
+++ current/kernel/sched_fair.c
@@ -511,7 +511,7 @@
 
 	if (!initial) {
 		/* sleeps upto a single latency don't count. */
-		if (sched_feat(NEW_FAIR_SLEEPERS) && entity_is_task(se))
+		if (sched_feat(NEW_FAIR_SLEEPERS))
 			vruntime -= sysctl_sched_latency;
 
 		/* ensure we never gain time by being placed backwards. */
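
For illustration, the effect of the entity_is_task() check on wakeup
placement can be modeled in userspace. This is only a sketch under stated
assumptions, not kernel code: struct entity, place_woken(), the
credit_groups flag and the 20 ms SCHED_LATENCY_NS value are hypothetical
names and numbers made up for the example, and NEW_FAIR_SLEEPERS is assumed
to be enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration; stands in for sysctl_sched_latency. */
#define SCHED_LATENCY_NS 20000000ULL

/* Hypothetical, stripped-down stand-in for struct sched_entity. */
struct entity {
	uint64_t vruntime;	/* virtual runtime accumulated so far */
	bool is_task;		/* false for a group-level entity */
};

/* Wraparound-safe maximum, same idea as the kernel's max_vruntime(). */
static uint64_t max_vruntime(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) > 0 ? b : a;
}

/*
 * Model of the wakeup (!initial) branch of place_entity().
 * credit_groups == false mimics the current check,
 * credit_groups == true mimics the patched check.
 */
static uint64_t place_woken(const struct entity *se, uint64_t min_vruntime,
			    bool credit_groups)
{
	uint64_t vruntime = min_vruntime;

	assert(se);

	/* sleeps up to a single latency don't count */
	if (se->is_task || credit_groups)
		vruntime -= SCHED_LATENCY_NS;

	/* never gain time by being placed backwards */
	return max_vruntime(se->vruntime, vruntime);
}
```

With min_vruntime at 100000000 and a sleeper whose own vruntime is
50000000, a task is placed at 80000000 in both variants. A group entity is
placed at 100000000 with the current check and at 80000000 with the patched
one, so without the patch it queues one full latency period further to the
right.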