> I think tenant will have per core weight, similar to sched entity's per
> cpu weight. The tenant's per core weight could derive from its
> corresponding taskgroup's per cpu sched entities' weight(sum them up
> perhaps). Tenant with higher weight will have its core wide vruntime
> advance slower than tenant with lower weight. Does this address the
> issue here?
>
I think that makes sense. Should work. We should also consider how to
classify untagged processes so that they are not starved.
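
To make the weighting idea concrete, here is a minimal userspace sketch,
assuming the core-wide vruntime is scaled the same way CFS scales a
sched entity's vruntime (delta_exec * nice-0 weight / weight). The
struct, the function names and MAX_SIBLINGS are hypothetical and only
meant for illustration; this is not existing kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define NICE_0_WEIGHT 1024ULL /* weight of a nice-0 task in the CFS weight table */
#define MAX_SIBLINGS  8       /* hypothetical bound on SMT siblings per core */

/* Hypothetical per-core tenant state; not an existing kernel structure. */
struct tenant_core {
	uint64_t se_weight[MAX_SIBLINGS]; /* task group's per-cpu entity weights */
	unsigned int nr_siblings;         /* number of valid entries above */
	uint64_t core_vruntime;           /* core-wide virtual runtime, in ns */
};

/* Per-core weight: sum of the group's per-cpu entity weights on this core. */
static uint64_t tenant_core_weight(const struct tenant_core *t)
{
	uint64_t sum = 0;

	for (unsigned int i = 0; i < t->nr_siblings; i++)
		sum += t->se_weight[i];

	/* Assumption of this sketch: fall back to 1 to avoid dividing by zero. */
	return sum ? sum : 1;
}

/* Charge delta_exec_ns of real runtime; a heavier tenant advances more slowly. */
static void tenant_account(struct tenant_core *t, uint64_t delta_exec_ns)
{
	t->core_vruntime += delta_exec_ns * NICE_0_WEIGHT / tenant_core_weight(t);
}
```

With two siblings, a tenant whose entities each weigh 1024 is charged
half the vruntime of a tenant whose entities each weigh 512 for the same
wall-clock runtime, so the heavier tenant gets picked more often.
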
>
> Care to elaborate the idea of coresched idle thread concept?
> How it solved the hyperthread going idle problem and what the accounting
> issues and wakeup issues are, etc.
>
So we have one coresched_idle thread per cpu, and when a sibling cannot
find a match, instead of forcing idle, we schedule this new thread.
Ideally this thread would behave like idle, but the scheduler would no
longer confuse an idle cpu with a forced idle state. This also invokes
schedule() as vruntime progresses (an alternative to your 3rd patch), and
vruntime accounting gets more consistent. There are special cases that
need to be handled, e.g. so that coresched_idle never gets scheduled in
the normal scheduling path (without coresched).

Hope this clarifies. But as Peter suggested, if we can differentiate
idle from forced idle in the idle thread and account for the vruntime
progress, that would be a better approach.

Thanks,
Vineeth
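
A rough illustration of the idle vs. forced idle distinction discussed
above, as a toy userspace model rather than kernel code. It assumes
tasks are matched by a per-task tag (called "cookie" here); every type
and function name below is made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What a sibling ends up doing after a core-wide pick. */
enum sibling_state { SIB_RUNNING, SIB_IDLE, SIB_FORCED_IDLE };

/* Hypothetical task and runqueue models; not kernel structures. */
struct toy_task { uint64_t cookie; };

struct toy_rq {
	const struct toy_task *tasks; /* runnable tasks on this sibling */
	size_t nr_running;
	uint64_t forced_idle_ns;      /* time spent idle despite runnable tasks */
};

/*
 * Pick for one sibling given the tag selected for the whole core.
 * An empty runqueue is genuine idle; runnable tasks with no matching
 * tag mean the sibling is forced idle.
 */
static enum sibling_state sibling_pick(const struct toy_rq *rq,
				       uint64_t core_cookie,
				       const struct toy_task **picked)
{
	*picked = NULL;

	if (rq->nr_running == 0)
		return SIB_IDLE;

	for (size_t i = 0; i < rq->nr_running; i++) {
		if (rq->tasks[i].cookie == core_cookie) {
			*picked = &rq->tasks[i];
			return SIB_RUNNING;
		}
	}

	return SIB_FORCED_IDLE;
}

/* Only forced idle time is charged, so it can feed vruntime accounting. */
static void sibling_account(struct toy_rq *rq, enum sibling_state state,
			    uint64_t delta_ns)
{
	if (state == SIB_FORCED_IDLE)
		rq->forced_idle_ns += delta_ns;
}
```

Keeping the two states apart is what lets the rest of the scheduler
treat a forced idle sibling as busy for accounting purposes while still
treating a genuinely idle cpu as a wakeup target.
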