On Sat, Feb 07, 2015 at 03:02:11PM +0800, Zefan Li wrote:
> Before 8323f26ce342 "sched: Fix race in task_group()", task_group() of
> those RT tasks always return root_task_group, but the escape can't happen.

Ah, yes, I'm an idiot. I'm not sure what I was thinking, but I seem
to have confused myself very well indeed.

> After commit 8323f26ce342, task_group always return root_task_group except
> for the case I showed:
> 
> 1. Change scheduling policy before setsid():
> 
>  # cat /proc/sched_debug | grep test
>  R           test  4194     24851.893077       945   120     24851.893077     11196.482331         0.000000 /
> 
> 2. Change policy after setsid():
> 
>  R           test  4142      4962.517723       420   120      4962.517723      4974.126149         0.000000 /autogroup-44

Yes, which of course is inconsistent as well; the task really is in the
autogroup, regardless of its class.
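
For illustration only, here is a toy user-space model of the ordering
problem above. It is a sketch, not kernel code: every toy_* name is
hypothetical, cached_tg stands in for the cached task_group() result,
and check_class selects between the old class test in
task_wants_autogroup() and the proposed behaviour without it.

```c
#include <stdbool.h>

/* Toy model only: these types and helpers are hypothetical, not kernel code. */
enum toy_class { TOY_FAIR, TOY_RT };

struct toy_group {
	const char *name;
};

struct toy_task {
	enum toy_class cls;
	struct toy_group *cached_tg;	/* stands in for the cached task_group() */
};

static struct toy_group toy_root = { "/" };

/* Mirrors the shape of task_wants_autogroup(), with the class test optional. */
static bool toy_wants_autogroup(const struct toy_task *p, bool check_class)
{
	if (p->cached_tg != &toy_root)
		return false;

	if (check_class && p->cls != TOY_FAIR)
		return false;

	return true;
}

/* setsid() is the point where the cached group gets (re)computed. */
static void toy_setsid(struct toy_task *p, struct toy_group *ag, bool check_class)
{
	if (toy_wants_autogroup(p, check_class))
		p->cached_tg = ag;
}

/* A policy change switches the class but does not refresh the cache. */
static void toy_set_policy(struct toy_task *p, enum toy_class cls)
{
	p->cls = cls;
}
```

With check_class set, a task that turns RT before toy_setsid() keeps
the root group while one that turns RT afterwards keeps the autogroup;
with it cleared, both orderings end up in the autogroup.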

> I think we can fix it with:
> 
> diff --git a/kernel/sched/auto_group.c b/kernel/sched/auto_group.c
> index 8a2e230..8c3a169 100644
> --- a/kernel/sched/auto_group.c
> +++ b/kernel/sched/auto_group.c
> @@ -115,9 +115,6 @@ bool task_wants_autogroup(struct task_struct *p, struct task_group *tg)
>       if (tg != &root_task_group)
>               return false;
>  
> -     if (p->sched_class != &fair_sched_class)
> -             return false;
> -

Yes.

> This is exactly what I did at first, but besides the issue described above,
> seems it might lead to starving RT tasks.
> 
> If there's some rt task in autogroups but none in root cgroup, it's allowed
> to set rt_runtime to 0, so I think we have to disallow this setting, like what
> we already do with global rt_runtime.

> @@ -7540,6 +7543,9 @@ static int __rt_schedulable(struct task_group *tg, u64 period, u64 runtime)
>               .rt_runtime = runtime,
>       };
>  
> +     if (tg == &root_task_group && runtime == 0)
> +             return -EINVAL;
> +

Indeed, setting runtime=0 for the root group is a very bad thing
regardless of this patch. It would disallow the kernel from creating RT
threads, which it needs for 'correct' operation in a number of cases.

But let's make that a separate patch.
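
As a rough sketch of what that separate check amounts to, modelled in
user space: the toy_rt_group type and toy_rt_runtime_allowed() helper
are hypothetical stand-ins, and only the rule itself (reject
runtime == 0 for the root group) comes from the quoted hunk.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct task_group; not kernel code. */
struct toy_rt_group {
	bool is_root;
};

/*
 * Reject a zero RT runtime for the root group only; any other
 * group/runtime combination is left to the normal schedulability test.
 */
static int toy_rt_runtime_allowed(const struct toy_rt_group *tg, uint64_t runtime)
{
	if (tg->is_root && runtime == 0)
		return -EINVAL;

	return 0;
}
```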

So how about this?

---
Subject: sched, autogroup: Fix failure to set cpu.rt_runtime_us
From: Peter Zijlstra <pet...@infradead.org>
Date: Mon Feb  9 11:53:18 CET 2015

Because task_group() uses a cache of autogroup_task_group(), whose
output depends on sched_class, switching classes can generate
problems.

In particular, when started as fair, the cache points to the
autogroup, so when switching to RT the tg_rt_schedulable() test fails
for every cpu.rt_{runtime,period}_us change because now the autogroup
has tasks and no runtime.

Furthermore, going back to the previous semantics of varying
task_group() with sched_class has the down-side that the sched_debug
output varies as well, even though the task really is in the
autogroup.

Therefore add an autogroup exception to tg_has_rt_tasks() -- such that
both (all) task_group() usages in sched/core now have one. And remove
all the remnants of the variable task_group() output.

Cc: Mike Galbraith <umgwanakikb...@gmail.com>
Cc: Stefan Bader <stefan.ba...@canonical.com>
Reported-by: Zefan Li <lize...@huawei.com>
Fixes: 8323f26ce342 ("sched: Fix race in task_group()")
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
---
 kernel/sched/auto_group.c |    6 +-----
 kernel/sched/core.c       |    6 ++++++
 2 files changed, 7 insertions(+), 5 deletions(-)

--- a/kernel/sched/auto_group.c
+++ b/kernel/sched/auto_group.c
@@ -87,8 +87,7 @@ static inline struct autogroup *autogrou
         * so we don't have to move tasks around upon policy change,
         * or flail around trying to allocate bandwidth on the fly.
         * A bandwidth exception in __sched_setscheduler() allows
-        * the policy change to proceed.  Thereafter, task_group()
-        * returns &root_task_group, so zero bandwidth is required.
+        * the policy change to proceed.
         */
        free_rt_sched_group(tg);
        tg->rt_se = root_task_group.rt_se;
@@ -115,9 +114,6 @@ bool task_wants_autogroup(struct task_st
        if (tg != &root_task_group)
                return false;
 
-       if (p->sched_class != &fair_sched_class)
-               return false;
-
        /*
         * We can only assume the task group can't go away on us if
         * autogroup_move_group() can see us on ->thread_group list.
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7644,6 +7644,12 @@ static inline int tg_has_rt_tasks(struct
 {
        struct task_struct *g, *p;
 
+       /*
+        * Autogroups do not have RT tasks; see autogroup_create().
+        */
+       if (task_group_is_autogroup(tg))
+               return 0;
+
        for_each_process_thread(g, p) {
                if (rt_task(p) && task_group(p) == tg)
                        return 1;

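Not part of the patch: a small user-space model of the
tg_has_rt_tasks() change, in case it helps review. The model_* types
and the array walk are assumptions standing in for struct task_group,
struct task_struct and for_each_process_thread(); only the early
return for autogroups reflects the hunk above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not kernel code. */
struct model_tg {
	bool is_autogroup;
};

struct model_task {
	bool is_rt;
	const struct model_tg *tg;	/* what task_group(p) would return */
};

/* Returns 1 if an RT task is accounted to tg, 0 otherwise. */
static int model_tg_has_rt_tasks(const struct model_tg *tg,
				 const struct model_task *tasks, size_t n)
{
	size_t i;

	/* Autogroups are treated as having no RT tasks, as in the patch. */
	if (tg->is_autogroup)
		return 0;

	for (i = 0; i < n; i++) {
		if (tasks[i].is_rt && tasks[i].tg == tg)
			return 1;
	}

	return 0;
}
```

In this model an RT task whose group is an autogroup no longer makes
the schedulability test see "tasks but no runtime", while RT tasks in
ordinary groups are still counted.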