Peter Zijlstra <pet...@infradead.org> writes:

> On Wed, Jul 09, 2014 at 12:07:08PM -0700, bseg...@google.com wrote:
>> Peter Zijlstra <pet...@infradead.org> writes:
>>
>> > On Wed, Jul 09, 2014 at 09:07:53AM +0800, Yuyang Du wrote:
>> >> That is challenging... Can someone (Peter) grant us a lock of the remote
>> >> rq? :)
>> >
>> > Nope :-).. we got rid of that lock for a good reason.
>> >
>> > Also, this is one area where I feel performance really trumps
>> > correctness, we can fudge the blocked load a little. So the
>> > sched_clock_cpu() difference is a strict upper bound on the
>> > rq_clock_task() difference (and under 'normal' circumstances shouldn't
>> > be much off).
>>
>> Well, unless IRQ_TIME_ACCOUNTING or such is on, in which case you lose.
>> Or am I misunderstanding the suggestion?
>
> If it's on it's still an upper bound, and typically the difference is not
> too large I think.
>
> Since clock_task is the regular clock minus some local amount, the
> difference between two regular clock reads is always a strict upper
> bound on clock_task differences.
>
>> Actually the simplest thing
>> would probably be to grab last_update_time (which on 32-bit could be
>> done with the _copy hack) and use that. Then I think the accuracy is
>> only worse than current in that you can lose runnable load as well as
>> blocked load, and that it isn't as easily corrected - currently if the
>> blocked tasks wake up they'll add the correct numbers to
>> runnable_load_avg, even if blocked_load_avg is screwed up and hit zero.
>> This code would have to wait until it stabilized again.
>
> The problem with that is that last_update_time is measured in
> clock_task, and you cannot transfer these values between CPUs.
> clock_task can drift unbounded between CPUs.

Yes, but we don't need to - we just use the remote last_update_time to
do a final update on p->se.avg, and then subtract that from cfs_rq->avg
with atomics (and then set p->se.avg.last_update_time to 0, as now).
This throws away any time since last_update_time, but that's no worse
than the current code, which throws away any time since decay_counter,
and both are updated from enqueue/dequeue/tick/update_blocked_averages.
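
To make the idea concrete, here is a rough userspace sketch, not kernel
code. Everything named sketch_* is made up for illustration, the
halve-every-32-units decay is only a placeholder for the real decay
math, and collecting the removed load in an atomic counter that the
owning CPU drains is just one assumed way of doing the "subtract with
atomics" step:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the kernel's structures. */
struct sketch_avg {
	uint64_t last_update_time;	/* old rq's clock_task units; 0 = detached */
	uint64_t load_avg;
};

struct sketch_cfs_rq {
	_Atomic uint64_t last_update_time;	/* published by the owning CPU */
	_Atomic uint64_t removed_load;		/* added to by remote CPUs */
	uint64_t load_avg;			/* touched only by the owning CPU */
};

/* Placeholder decay: halve once per 32 elapsed units. */
static uint64_t sketch_decay(uint64_t load, uint64_t delta)
{
	uint64_t halvings = delta / 32;

	return halvings >= 64 ? 0 : load >> halvings;
}

/*
 * May run on a CPU that does not own rq. It never reads its own clock:
 * the entity is aged only up to the rq's published last_update_time,
 * so both timestamps are in the same clock_task domain.
 */
static void sketch_remove_entity(struct sketch_cfs_rq *rq, struct sketch_avg *se)
{
	uint64_t last = atomic_load(&rq->last_update_time);

	if (last > se->last_update_time)
		se->load_avg = sketch_decay(se->load_avg,
					    last - se->last_update_time);

	/* Hand the contribution to the owner without taking the rq lock. */
	atomic_fetch_add(&rq->removed_load, se->load_avg);
	se->last_update_time = 0;
}

/* Runs only on the owning CPU (enqueue/dequeue/tick and friends). */
static void sketch_update_cfs_rq(struct sketch_cfs_rq *rq, uint64_t now)
{
	uint64_t last = atomic_load(&rq->last_update_time);
	uint64_t removed = atomic_exchange(&rq->removed_load, 0);

	/* Removed load was aged to 'last', so subtract it before decaying. */
	rq->load_avg -= removed < rq->load_avg ? removed : rq->load_avg;

	if (now > last) {
		rq->load_avg = sketch_decay(rq->load_avg, now - last);
		atomic_store(&rq->last_update_time, now);
	}
}
```

If the owner updates between the remote load of last_update_time and
the remote add, the removed amount is slightly stale and the clamp in
sketch_update_cfs_rq() absorbs any overshoot. That is the same kind of
fudge as losing the time since last_update_time.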