On 17/11/20 12:52, Valentin Schneider wrote:
> On 17/11/20 09:46, Peter Zijlstra wrote:
>> How's this then? It still doesn't explicitly call out the specific race,
>> but does mention the more fundamental issue that wakelist queueing
>> doesn't respect the regular rules anymore.
>>
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -775,7 +775,6 @@ struct task_struct {
>>      unsigned                        sched_reset_on_fork:1;
>>      unsigned                        sched_contributes_to_load:1;
>>      unsigned                        sched_migrated:1;
>> -    unsigned                        sched_remote_wakeup:1;
>>  #ifdef CONFIG_PSI
>>      unsigned                        sched_psi_wake_requeue:1;
>>  #endif
>> @@ -785,6 +784,21 @@ struct task_struct {
>>  
>>      /* Unserialized, strictly 'current' */
>>  
>> +    /*
>> +     * This field must not be in the scheduler word above due to wakelist
>> +     * queueing no longer being serialized by p->on_cpu. However:
>> +     *
>> +     * p->XXX = X;                  ttwu()
>> +     * schedule()                     if (p->on_rq && ..) // false
>> +     *   smp_mb__after_spinlock();    if (smp_load_acquire(&p->on_cpu) && //true
>> +     *   deactivate_task()                ttwu_queue_wakelist())
>> +     *     p->on_rq = 0;                    p->sched_remote_wakeup = Y;
>> +     *
>> +     * guarantees all stores of 'current' are visible before
>> +     * ->sched_remote_wakeup gets used, so it can be in this word.
>> +     */
>
> Isn't the control dep between that ttwu() p->on_rq read and
> p->sched_remote_wakeup write "sufficient"?

Or rather smp_acquire__after_ctrl_dep(), since a plain control dependency
only orders the ->on_rq load against later stores, and we need
  ->on_rq load => 'current' bits load + store
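
FWIW, here is a rough user-space analogy of that ordering, as a sketch only
and not kernel code. Everything in it is hypothetical: struct fake_task, the
two thread functions and run_wakelist_demo() are made-up names, the seq_cst
fence merely stands in for smp_mb__after_spinlock(), and the acquire fence
after the polling loop stands in for the ctrl dep +
smp_acquire__after_ctrl_dep(). The point is that the waker's write to its
bit is a read-modify-write of the shared word, so it must observe the bit
'current' stored earlier.

```c
#include <assert.h>     /* only for the assert() in the usage note */
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for task_struct: two bits sharing one word. */
struct fake_task {
	unsigned in_execve:1;            /* written only by 'current' */
	unsigned sched_remote_wakeup:1;  /* written only by the waker */
	atomic_int on_rq;                /* separate memory location */
};

static struct fake_task task;

/* Models the left-hand column: p->XXX = X; schedule(); p->on_rq = 0. */
static void *current_side(void *arg)
{
	(void)arg;
	task.in_execve = 1;
	/* Assumption: approximates smp_mb__after_spinlock(). */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&task.on_rq, 0, memory_order_relaxed);
	return NULL;
}

/* Models the right-hand column: ttwu() seeing !p->on_rq, then writing. */
static void *waker_side(void *arg)
{
	(void)arg;
	/* Branch on the ->on_rq load: this is the control dependency. */
	while (atomic_load_explicit(&task.on_rq, memory_order_relaxed))
		;
	/* Assumption: approximates smp_acquire__after_ctrl_dep(). */
	atomic_thread_fence(memory_order_acquire);
	/* RMW of the shared word; must not lose in_execve. */
	task.sched_remote_wakeup = 1;
	return NULL;
}

/* Returns in_execve | sched_remote_wakeup << 1, or -1 on thread failure. */
int run_wakelist_demo(void)
{
	pthread_t cur, waker;

	task.in_execve = 0;
	task.sched_remote_wakeup = 0;
	atomic_store(&task.on_rq, 1);

	if (pthread_create(&waker, NULL, waker_side, NULL))
		return -1;
	if (pthread_create(&cur, NULL, current_side, NULL)) {
		atomic_store(&task.on_rq, 0);   /* release the spinning waker */
		pthread_join(waker, NULL);
		return -1;
	}
	pthread_join(cur, NULL);
	pthread_join(waker, NULL);

	return task.in_execve | (task.sched_remote_wakeup << 1);
}
```

With both fences in place, assert(run_wakelist_demo() == 3) should hold on
every run (build with -pthread). Dropping the acquire fence would leave the
two bitfield writes unordered, which C11 treats as a data race on the shared
word.
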

> That should be giving the right
> ordering for the rest of ttwu() wrt. those 'current' bits, considering they
> are written before that smp_mb__after_spinlock().
>
> In any case, consider me convinced:
>
> Reviewed-by: Valentin Schneider <valentin.schnei...@arm.com>
>
>> +    unsigned                        sched_remote_wakeup:1;
>> +
>>      /* Bit to tell LSMs we're in execve(): */
>>      unsigned                        in_execve:1;
>>      unsigned                        in_iowait:1;
