On Thu, Sep 12, 2019 at 8:35 AM Aaron Lu <aaron...@linux.alibaba.com> wrote:
> >
> > I think comparing parent's runtime also will have issues once
> > the task group has a lot more threads with different running
> > patterns. One example is a task group with lot of active threads
> > and a thread with fairly less activity. So when this less active
> > thread is competing with a thread in another group, there is a
> > chance that it loses continuously for a while until the other
> > group catches up on its vruntime.
>
> I actually think this is expected behaviour.
>
> Without core scheduling, when deciding which task to run, we will first
> decide which "se" to run from the CPU's root level cfs runqueue and then
> go downwards. Let's call the chosen se on the root level cfs runqueue
> the winner se. Then with core scheduling, we will also need compare the
> two winner "se"s of each hyperthread and choose the core wide winner "se".
>
Sorry, I misunderstood the fix: I did not initially see the core-wide
min_vruntime that you maintain in rq->core. This approach seems
reasonable. I think we can fix the potential starvation that you
mentioned in the comment by adjusting for the difference in all the
child cfs_rqs when we set the min_vruntime in rq->core. Since we take
the lock for both queues, it should be doable, and I am trying to see
how we can best do that.
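To make the comparison step concrete, here is a minimal userspace sketch,
not kernel code. struct toy_se, vruntime_before() and core_pick() are
made-up names for illustration only, and the sketch assumes both vruntimes
are already expressed against the same core-wide min_vruntime baseline:

```c
#include <stdint.h>

/* Toy stand-in for a root-level scheduling entity (hypothetical,
 * not the kernel's struct sched_entity). */
struct toy_se {
	const char *name;
	uint64_t vruntime;	/* assumed relative to one core-wide baseline */
};

/* Wraparound-safe "a ran less than b" test using a signed delta. */
static int vruntime_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/* Given the winner se of each hyperthread, pick the core-wide winner:
 * the one with the smaller vruntime. Ties go to the first argument. */
static const struct toy_se *core_pick(const struct toy_se *ht0,
				      const struct toy_se *ht1)
{
	return vruntime_before(ht1->vruntime, ht0->vruntime) ? ht1 : ht0;
}
```

With one winner at vruntime 1000 and the other at 400, core_pick() returns
the one at 400. The comparison is only meaningful because both values share
one baseline, which is what a single min_vruntime in rq->core provides.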
> >
> > As discussed during LPC, probably start thinking along the lines
> > of global vruntime or core wide vruntime to fix the vruntime
> > comparison issue?
>
> core wide vruntime makes sense when there are multiple tasks of
> different cgroups queued on the same core. e.g. when there are two
> tasks of cgroupA and one task of cgroupB are queued on the same core,
> assume cgroupA's one task is on one hyperthread and its other task is on
> the other hyperthread with cgroupB's task. With my current
> implementation or Tim's, cgroupA will get more time than cgroupB. If we
> maintain core wide vruntime for cgroupA and cgroupB, we should be able
> to maintain fairness between cgroups on this core. Tim propose to solve
> this problem by doing some kind of load balancing if I'm not mistaken, I
> haven't taken a look at this yet.

I think your fix is already close to maintaining a core-wide vruntime,
since you now have a single min_vruntime to compare across the siblings
in the core. To make the fix complete, we might need to adjust the whole
tree's min_vruntime, and I think it's doable.

Thanks,
Vineeth