On Thu, Sep 07, 2017 at 07:26:23PM +0530, Prateek Sood wrote:
> Remove circular dependency deadlock in a scenario where hotplug of CPU is
> being done while there is updation in cgroup and cpuset triggered from
> userspace.

You've forgotten to mention your solution to the deadlock, namely
inverting cpuset_mutex and cpu_hotplug_lock.

> Signed-off-by: Prateek Sood <prs...@codeaurora.org>
> ---
>  kernel/cgroup/cpuset.c | 32 +++++++++++++++++++-------------
>  1 file changed, 19 insertions(+), 13 deletions(-)
> 
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 2f4039b..60dc0ac 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -816,16 +816,15 @@ static int generate_sched_domains(cpumask_var_t 
> **domains,
>   * 'cpus' is removed, then call this routine to rebuild the
>   * scheduler's dynamic sched domains.
>   *
> - * Call with cpuset_mutex held.  Takes get_online_cpus().
>   */
> -static void rebuild_sched_domains_locked(void)
> +static void rebuild_sched_domains_cpuslocked(void)
>  {
>       struct sched_domain_attr *attr;
>       cpumask_var_t *doms;
>       int ndoms;
>  
> +     lockdep_assert_cpus_held();
>       lockdep_assert_held(&cpuset_mutex);
> -     get_online_cpus();
>  
>       /*
>        * We have raced with CPU hotplug. Don't do anything to avoid
> @@ -833,27 +832,27 @@ static void rebuild_sched_domains_locked(void)
>        * Anyways, hotplug work item will rebuild sched domains.
>        */
>       if (!cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
> -             goto out;
> +             return;
>  
>       /* Generate domain masks and attrs */
>       ndoms = generate_sched_domains(&doms, &attr);
>  
>       /* Have scheduler rebuild the domains */
>       partition_sched_domains(ndoms, doms, attr);
> -out:
> -     put_online_cpus();
>  }
>  #else /* !CONFIG_SMP */
> -static void rebuild_sched_domains_locked(void)
> +static void rebuild_sched_domains_cpuslocked(void)
>  {
>  }
>  #endif /* CONFIG_SMP */
>  
>  void rebuild_sched_domains(void)
>  {
> +     get_online_cpus();
>       mutex_lock(&cpuset_mutex);
> -     rebuild_sched_domains_locked();
> +     rebuild_sched_domains_cpuslocked();
>       mutex_unlock(&cpuset_mutex);
> +     put_online_cpus();
>  }

But if you invert these locks, the need for cpuset_hotplug_workfn() goes
away, at least for the CPU part, and we can make in synchronous again.
Yay!!

Also, I think new code should use cpus_read_lock() instead of
get_online_cpus().

Reply via email to