On 07/10/2013 10:40 AM, Michael Wang wrote:
> On 07/09/2013 07:51 PM, Bartlomiej Zolnierkiewicz wrote:
> [snip]
>>
>> It doesn't help and unfortunately it just can't help as it only
>> addresses lockdep functionality while the issue is not a lockdep
>> problem but a genuine locking problem. CPU hot-unplug invokes
>> _cpu_down() which calls cpu_hotplug_begin() which in turn takes
>> &cpu_hotplug.lock. The lock is then hold during __cpu_notify()
>> call. Notifier chain goes up to cpufreq_governor_dbs() which for
>> CPUFREQ_GOV_STOP event does gov_cancel_work(). This function
>> flushes pending work and waits for it to finish. The all above
>> happens in one kernel thread. At the same time the other kernel
>> thread is doing the work we are waiting to complete and it also
>> happens to do gov_queue_work() which calls get_online_cpus().
>> Then the code tries to take &cpu_hotplug.lock which is already
>> held by the first thread and deadlocks.
> 
> Hmm...I think I get your point, some thread hold the lock and
> flush some work which also try to hold the same lock, correct?
> 
> Ok, that's a problem, let's figure out a good way to solve it :)

And besides the patch from Srivatsa, I also suggest below fix, it's
try to really stop all the works during down notify, I'd like to know
how do you think about it ;-)

Regards,
Michael Wang

diff --git a/drivers/cpufreq/cpufreq_governor.c 
b/drivers/cpufreq/cpufreq_governor.c
index dc9b72e..a64b544 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct 
cpufreq_policy *policy,
 {
        int i;
 
+       if (dbs_data->queue_stop)
+               return;
+
        if (!all_cpus) {
                __gov_queue_work(smp_processor_id(), dbs_data, delay);
        } else {
-               get_online_cpus();
                for_each_cpu(i, policy->cpus)
                        __gov_queue_work(i, dbs_data, delay);
-               put_online_cpus();
        }
 }
 EXPORT_SYMBOL_GPL(gov_queue_work);
@@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data 
*dbs_data,
                struct cpufreq_policy *policy)
 {
        struct cpu_dbs_common_info *cdbs;
-       int i;
+       int i, round = 2;
 
+       dbs_data->queue_stop = 1;
+redo:
+       round--;
        for_each_cpu(i, policy->cpus) {
                cdbs = dbs_data->cdata->get_cpu_cdbs(i);
                cancel_delayed_work_sync(&cdbs->work);
        }
+
+       /*
+        * Since there is no lock to prvent re-queue the
+        * cancelled work, some early cancelled work might
+        * have been queued again by later cancelled work.
+        *
+        * Flush the work again with dbs_data->queue_stop
+        * enabled, this time there will be no survivors.
+        */
+       if (round)
+               goto redo;
+       dbs_data->queue_stop = 0;
 }
 
 /* Will return if we need to evaluate cpu load again or not */
diff --git a/drivers/cpufreq/cpufreq_governor.h 
b/drivers/cpufreq/cpufreq_governor.h
index e16a961..9116135 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -213,6 +213,7 @@ struct dbs_data {
        unsigned int min_sampling_rate;
        int usage_count;
        void *tuners;
+       int queue_stop;
 
        /* dbs_mutex protects dbs_enable in governor start/stop */
        struct mutex mutex;




> 
> Regards,
> Michael Wang
> 
> 
> 
> 
>>
>> Best regards,
>> --
>> Bartlomiej Zolnierkiewicz
>> Samsung R&D Institute Poland
>> Samsung Electronics
>>
>>> diff --git a/drivers/cpufreq/cpufreq_governor.c 
>>> b/drivers/cpufreq/cpufreq_governor.c
>>> index 5af40ad..aa05eaa 100644
>>> --- a/drivers/cpufreq/cpufreq_governor.c
>>> +++ b/drivers/cpufreq/cpufreq_governor.c
>>> @@ -229,6 +229,8 @@ static void set_sampling_rate(struct dbs_data *dbs_data,
>>>         }
>>>  }
>>>  
>>> +static struct lock_class_key j_cdbs_key;
>>> +
>>>  int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>>>                 struct common_dbs_data *cdata, unsigned int event)
>>>  {
>>> @@ -366,6 +368,8 @@ int (struct cpufreq_policy *policy,
>>>                                         
>>> kcpustat_cpu(j).cpustat[CPUTIME_NICE];
>>>  
>>>                         mutex_init(&j_cdbs->timer_mutex);
>>> +                       lockdep_set_class(&j_cdbs->timer_mutex, 
>>> &j_cdbs_key);
>>> +
>>>                         INIT_DEFERRABLE_WORK(&j_cdbs->work,
>>>                                              
>>> dbs_data->cdata->gov_dbs_timer);
>>>                 }
>>>
>>> Regards,
>>> Michael Wang
>>
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to