ebied...@xmission.com (Eric W. Biederman) writes:

> Stephen Brennan <stephen.s.bren...@oracle.com> writes:
>
>> The pid_revalidate() function requires dropping from RCU into REF lookup
>> mode. When many threads are resolving paths within /proc in parallel,
>> this can result in heavy spinlock contention as each thread tries to
>> grab a reference to the /proc dentry lock (and drop it shortly
>> thereafter).
>
> I am feeling dense at the moment.  Which lock specifically are you
> referring to?  The only locks I can thinking of are sleeping locks,
> not spinlocks.

The lock in question is the d_lockref field (aliased as d_lock) of
struct dentry. It is contended in this code path while processing the
"/proc" dentry, switching from RCU to REF mode.

    walk_component()
      lookup_fast()
        d_revalidate()
          pid_revalidate() // returns -ECHILD
        unlazy_child()
          lockref_get_not_dead(&nd->path.dentry->d_lockref)

>
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index ebea9501afb8..833d55a59e20 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -1830,19 +1846,22 @@ static int pid_revalidate(struct dentry *dentry, 
>> unsigned int flags)
>>  {
>>      struct inode *inode;
>>      struct task_struct *task;
>> +    int rv = 0;
>>  
>> -    if (flags & LOOKUP_RCU)
>> -            return -ECHILD;
>> -
>> -    inode = d_inode(dentry);
>> -    task = get_proc_task(inode);
>> -
>> -    if (task) {
>> -            pid_update_inode(task, inode);
>> -            put_task_struct(task);
>> -            return 1;
>> +    if (flags & LOOKUP_RCU) {
>
> Why do we need to test flags here at all?
> Why can't the code simply take an rcu_read_lock unconditionally and just
> pass flags into do_pid_update_inode?
>

I don't have any good reason. If it is safe to update the inode without
holding a reference to the task struct (or holding any other lock) then
I can consolidate the whole conditional.

>
>> +            inode = d_inode_rcu(dentry);
>> +            task = pid_task(proc_pid(inode), PIDTYPE_PID);
>> +            if (task)
>> +                    rv = do_pid_update_inode(task, inode, flags);
>> +    } else {
>> +            inode = d_inode(dentry);
>> +            task = get_proc_task(inode);
>> +            if (task) {
>> +                    rv = do_pid_update_inode(task, inode, flags);
>> +                    put_task_struct(task);
>> +            }
>
>>      }
>> -    return 0;
>> +    return rv;
>>  }
>>  
>>  static inline bool proc_inode_is_dead(struct inode *inode)
>
> Eric

Reply via email to