On Fri, Aug 24, 2018 at 02:24:48PM -0700, Steve Muckle wrote:
> On 08/24/2018 02:47 AM, Peter Zijlstra wrote:
> > > > On 08/17/2018 11:27 AM, Steve Muckle wrote:
> >
> > > > > When rt_mutex_setprio changes a task's scheduling class to RT,
> > > > > we're seeing cases where the task's vruntime is not updated
> > > > > correctly upon return to the fair class.
> >
> > > > > Specifically, the following is being observed:
> > > > > - task is deactivated while still in the fair class
> > > > > - task is boosted to RT via rt_mutex_setprio, which changes
> > > > >   the task to RT and calls check_class_changed.
> > > > > - check_class_changed leads to detach_task_cfs_rq, at which point
> > > > >   the vruntime_normalized check sees that the task's state is
> > > > >   TASK_WAKING,
> > > > >   which results in skipping the subtraction of the rq's min_vruntime
> > > > >   from the task's vruntime
> > > > > - later, when the prio is deboosted and the task is moved back
> > > > >   to the fair class, the fair rq's min_vruntime is added to
> > > > >   the task's vruntime, even though it wasn't subtracted earlier.
> >
> > I'm thinking that is an incomplete scenario; where do we get to
> > TASK_WAKING.
>
> Yes there's a missing bit of context here at the beginning that the task to
> be boosted had already been put into TASK_WAKING.

See, I'm confused...

The only time TASK_WAKING is visible, is if we've done a remote wakeup
and it's 'stuck' on the remote wake_list. And in that case we've done
migrate_task_rq_fair() on it.

So by the time either rt_mutex_setprio() or __sched_setscheduler() get
to calling check_class_changed(), under both pi_lock and rq->lock, the
vruntime_normalized() thing should be right.

So please detail the exact scenario. Because I'm not seeing it.
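
For reference, the accounting being argued about can be written down as a
toy userspace model. This is only a sketch, not kernel code: every toy_*
name is hypothetical, min_vruntime is held constant for simplicity, and
the two assumptions are (a) a task in the waking state is treated as
already normalized, as described in the quoted report, and (b) the
migration step on a remote wakeup is what subtracts min_vruntime. Under
those assumptions the round trip only balances if the migration step
actually ran before the class change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; all names are hypothetical stand-ins. */
enum toy_state { TOY_RUNNABLE, TOY_WAKING };

struct toy_task {
    enum toy_state state;
    int64_t vruntime;
};

/* Assumption (a): a waking task counts as already normalized. */
static bool toy_vruntime_normalized(const struct toy_task *p)
{
    return p->state == TOY_WAKING;
}

/* Assumption (b): migrating on wakeup makes vruntime relative. */
static void toy_migrate(struct toy_task *p, int64_t min_vruntime)
{
    p->vruntime -= min_vruntime;
}

/* Leaving the fair class: subtract unless already normalized. */
static void toy_detach(struct toy_task *p, int64_t min_vruntime)
{
    if (!toy_vruntime_normalized(p))
        p->vruntime -= min_vruntime;
}

/* Returning to the fair class: min_vruntime is added back,
 * following the last step of the quoted report. */
static void toy_return_to_fair(struct toy_task *p, int64_t min_vruntime)
{
    p->vruntime += min_vruntime;
}

/* Run one boost/deboost cycle and return the final vruntime. */
static int64_t toy_boost_cycle(bool migrated_on_wakeup)
{
    const int64_t min_vruntime = 1000;
    struct toy_task p = { .state = TOY_WAKING, .vruntime = 1050 };

    if (migrated_on_wakeup)
        toy_migrate(&p, min_vruntime);      /* 1050 -> 50 */

    toy_detach(&p, min_vruntime);           /* skipped: task is waking */

    p.state = TOY_RUNNABLE;                 /* wakeup completes */
    toy_return_to_fair(&p, min_vruntime);   /* deboost adds 1000 */

    return p.vruntime;
}
```

With these numbers, toy_boost_cycle(true) ends at the starting value of
1050, while toy_boost_cycle(false) ends at 2050, i.e. min_vruntime has been
added without a matching subtraction. Whether the second path can actually
be reached in the kernel is exactly the open question above.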