On 12/07/2012 12:58 AM, Steven Rostedt wrote:
> On Fri, 2012-12-07 at 00:18 +0530, Srivatsa S. Bhat wrote:
>> On 12/06/2012 09:48 PM, Oleg Nesterov wrote:
>>> On 12/06, Srivatsa S. Bhat wrote:
>>>>
>>>> +void get_online_cpus_atomic(void)
>>>> +{
>>>> +  int c, old;
>>>> +
>>>> +  preempt_disable();
>>>> +  read_lock(&hotplug_rwlock);
>>>
>>> Confused... Why it also takes hotplug_rwlock?
>>
>> To avoid ABBA deadlocks.
>>
>> hotplug_rwlock was meant for the "light" readers.
>> The atomic counters were meant for the "heavy/full" readers.
>> I wanted them to be able to nest in any manner they wanted,
>> such as:
>>
>> Full inside light:
>>
>> get_online_cpus_atomic_light()
>>      ...
>>      get_online_cpus_atomic_full()
>>      ...
>>      put_online_cpus_atomic_full()
>>      ...
>> put_online_cpus_atomic_light()
>>
>> Or, light inside full:
>>
>> get_online_cpus_atomic_full()
>>      ...
>>      get_online_cpus_atomic_light()
>>      ...
>>      put_online_cpus_atomic_light()
>>      ...
>> put_online_cpus_atomic_full()
>>
>> To allow this, I made the two sets of APIs take the locks
>> in the same order internally.
>>
>> (I had some more description of this logic in the changelog
>> of 2/10; the only difference there is that instead of atomic
>> counters, I used rwlocks for the full-readers as well.
>> https://lkml.org/lkml/2012/12/5/320)
>>
> 
> You know reader locks can deadlock with each other, right? And this
> isn't caught by lockdep yet. This is because rwlocks have been made to
> be fair with writers. Before, writers could be starved if a CPU always
> let a reader in. Now if a writer is waiting, a reader will block behind
> the writer. But this has introduced new issues with the kernel as
> follows:
> 
> 
>    CPU0            CPU1            CPU2            CPU3
>    ----            ----            ----            ----
> read_lock(A);
>                 read_lock(B)
>                                 write_lock(A) <- block
>                                                 write_lock(B) <- block
> read_lock(B) <- block
> 
>                 read_lock(A) <- block
> 
> DEADLOCK!
> 

The root cause of this deadlock is again a lock-ordering mismatch, right?
CPU0 takes the locks in the order A, B.
CPU1 takes the locks in the order B, A.

The waiting writers are what turn that mismatch into an actual deadlock:
each reader's second read_lock() queues behind a blocked writer.

I avoid this in this patchset by always taking the locks in the same
order, so we won't deadlock like this.

Regards,
Srivatsa S. Bhat

