>> I am just curious... can you reproduce the problem reliably? If yes, can you
>> try the patch below? Just in case, this is not the real fix in any case...
>
> Yes. It deterministically results in hung processes in vanilla kernel.
> I'll try this patch.

I have to correct this. I can reproduce the issue easily on
high-end servers and normal laptops, but for some reason it does not
happen very often in VMware guests (maybe related to their lower
parallelism).

>> --- x/kernel/sched/core.c
>> +++ x/kernel/sched/core.c
>> @@ -2793,8 +2793,11 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
>>         balance_callback(rq);
>>         preempt_enable();
>>
>> -       if (current->set_child_tid)
>> +       if (current->set_child_tid) {
>> +               mem_cgroup_oom_enable();
>>                 put_user(task_pid_vnr(current), current->set_child_tid);
>> +               mem_cgroup_oom_disable();
>> +       }
>>  }
>>
>>  /*

I tried this patch and I still see the same stuck processes (assuming
that's what you were curious about).
