On Thu, Jul 10, 2014 at 10:06:27AM -0700, bseg...@google.com wrote:
> So, sched_clock(_cpu) can be arbitrarily far off of cfs_rq_clock_task, so you
> can't really do that. Ideally, yes, you would account for any time since
> the last update and account that time as !runnable. However, I don't
> think there is any good way to do that, and the current code doesn't.
Yeah. We only catch the migrating task up to its cfs_rq and subtract. There is
no catching up to "current" time.

> > I made another mistake. We should not only track task entity load; the
> > group entity (as an entity) is also needed. Otherwise, task_h_load can't
> > be done correctly... Sorry for the mess-up, but this won't change much in
> > the code.
>
> This will increase it to 2x __update_load_avg per cgroup per
> enqueue/dequeue. What does this (and this patch in general) do to
> context switch cost at cgroup depth 1/2/3?

We can update the cfs_rq's load_avg and let the cfs_rq's own se take a ride in
that update. The two should stay exactly synchronized anyway (the group se's
load is only useful for the task_h_load calculation, and the group cfs_rq's
load is useful for task_h_load and update_cfs_share). And technically it looks
easy:

To update the cfs_rq, the update weight is:

	cfs_rq->load.weight

To update its se, the update weight is:

	cfs_rq->tg->se[cpu]->load.weight * on_rq

So it will not increase to 2x, but maybe 1.05x, :)

Thanks,
Yuyang
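
Below is a rough standalone toy model of the combined update described above,
just to make the "1.05x, not 2x" argument concrete: the decay factor for the
elapsed time is computed once and then applied to both the cfs_rq sum and the
group se sum, each with its own weight. All names, constants, and the
simplified accumulation here are assumptions for illustration only; the real
__update_load_avg() also handles partial periods and the runnable/running
split.

/*
 * toy_pelt.c: standalone toy model of the combined cfs_rq + group se
 * update.  NOT kernel code; names, constants and the simplified
 * accumulation are assumptions for illustration.  The point shown: the
 * decay factor for the elapsed time is computed once and applied to
 * both sums with their own weights, so tracking the group se on top of
 * the cfs_rq is far cheaper than a second full __update_load_avg() pass.
 */
#include <stdio.h>
#include <stdint.h>

#define PERIOD_US	1024ULL		/* ~1ms accounting period          */
#define DECAY_SHIFT	10		/* fixed point: 1024 == factor 1.0 */
#define DECAY_Y		1002ULL		/* ~0.9785 * 1024, so y^32 ~= 1/2  */

struct toy_avg {
	uint64_t load_sum;		/* decayed, weighted runnable time */
};

/* Fixed-point y^periods; the expensive, shareable part of an update. */
static uint64_t decay_factor(unsigned int periods)
{
	uint64_t f = 1ULL << DECAY_SHIFT;

	while (periods--)
		f = (f * DECAY_Y) >> DECAY_SHIFT;
	return f;
}

/*
 * One combined update.  In the proposal the weights would be
 * cfs_rq->load.weight and cfs_rq->tg->se[cpu]->load.weight * on_rq;
 * here they are plain parameters.  The whole delta is (crudely) treated
 * as runnable at the current weight, ignoring intra-period decay.
 */
static void combined_update(struct toy_avg *rq, unsigned long rq_weight,
			    struct toy_avg *se, unsigned long se_weight,
			    uint64_t delta_us)
{
	unsigned int periods = delta_us / PERIOD_US;
	uint64_t decay = decay_factor(periods);		/* done once */

	rq->load_sum = ((rq->load_sum * decay) >> DECAY_SHIFT)
		       + rq_weight * delta_us;
	se->load_sum = ((se->load_sum * decay) >> DECAY_SHIFT)
		       + se_weight * delta_us;
}

int main(void)
{
	struct toy_avg rq = { 0 }, se = { 0 };
	int i;

	/* Three 4ms runnable stretches on a cfs_rq of weight 2048 whose
	 * group se (weight 1024) is on_rq. */
	for (i = 0; i < 3; i++)
		combined_update(&rq, 2048, &se, 1024, 4096);

	printf("cfs_rq load_sum=%llu  group se load_sum=%llu\n",
	       (unsigned long long)rq.load_sum,
	       (unsigned long long)se.load_sum);
	return 0;
}

Built with e.g. "gcc -o toy_pelt toy_pelt.c", the two printed sums differ only
by the weight ratio, since every other term in the update is shared between
the cfs_rq and its group se.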