On Fri, 14 Oct 2005, Poul-Henning Kamp wrote:

In message <[EMAIL PROTECTED]>, Bruce Evans writes:
The timestamps in mi_switch() are taken on the same CPU and only their
differences are used, so they don't even need to be synced.  If they
use the TSC, then the TSCs just need to have the same almost-constant
frequency (or different almost-constant frequencies if timecounters
were per-CPU).
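
To illustrate why unsynced counters are enough, here is a toy model, not
kernel code.  CpuCounter, its offset and runtime_ticks() are hypothetical
names made up for this sketch; the only assumption is that both readings
come from the same counter, as they do in mi_switch().

```python
class CpuCounter:
    """Hypothetical per-CPU cycle counter with an arbitrary start offset."""

    def __init__(self, offset, ticks_per_us):
        self.offset = offset              # unknown, differs between CPUs
        self.ticks_per_us = ticks_per_us  # assumed almost-constant frequency

    def read(self, now_us):
        # Raw counter value at wall time now_us (microseconds).
        return self.offset + now_us * self.ticks_per_us


def runtime_ticks(counter, switch_in_us, switch_out_us):
    # Only the difference of two readings from the SAME counter is used,
    # so the counter's offset cancels out.
    return counter.read(switch_out_us) - counter.read(switch_in_us)


cpu0 = CpuCounter(offset=0, ticks_per_us=1000)
cpu1 = CpuCounter(offset=123456789, ticks_per_us=1000)
print(runtime_ticks(cpu0, 1000, 1500), runtime_ticks(cpu1, 1000, 1500))
```

Both CPUs report the same interval even though their raw counter values
never agree.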

Actually, I think we need to go back a step further.

The task of the scheduler is to hand out a finite resource according
to a set policy.

The finite resource is "instructions executed by a CPU".

It used to be that CPUs ran at constant clock rates, and therefore
implementors made the simplifying assumption that

        instructions = a * time

for some random but constant a and made their scheduling decisions
based on time.

This is currently moot for p_runtime.  p_runtime is not used, at least
not for kernel scheduling.  It is only used by userland (mostly for users
to look at?).  Schedulers use only ticks set periodically by sched_clock().
They should use p_runtime, given that we already pay the enormous cost
of setting it on every normal interrupt.

Today CPUs do not run at constant rates, but they have counters which
count the number of instruction cycles.  Therefore talking about
computer effort in terms of "CPU second" is like selling rubber
band by the inch.
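
A small worked example of how "instructions = a * time" stops holding once
the clock rate changes.  The frequencies and durations are invented for
illustration, and cycles_executed() is a hypothetical helper, not an
existing interface.

```python
def cycles_executed(segments):
    """Sum cycles over (frequency_hz, duration_s) segments."""
    return sum(freq_hz * duration_s for freq_hz, duration_s in segments)


# Two processes, each charged 2 "CPU seconds".
constant = [(2_000_000_000, 2)]                        # 2 GHz throughout
throttled = [(2_000_000_000, 1), (1_000_000_000, 1)]   # drops to 1 GHz

print(cycles_executed(constant), cycles_executed(throttled))
```

Both are billed the same time, but the throttled one got only three
quarters of the cycles.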

The scheduler has a side job of accounting for CPU usage and the
API for accessing this info has unfortunately been specified in
terms of time rather than instructions.

I disagree.  Time is the only useful metric for users, and scheduling
is fuzzy so it doesn't really care.  Scheduling needs an approximation
of resource usage that can be obtained very efficiently.  Its tick counts
are very efficient and are precise enough even with a 100Hz period,
but they aren't accurate enough since applications can hide from
statistics clock ticks either accidentally or intentionally.  statclock
was supposed to fix this but it never really did, especially with too-large
values of HZ like 1000 -- with hz > stathz it is easy to use a periodic
itimer to arrange to run about (hz - stathz) / hz of the time without ever
seeing a statclock tick.
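
A rough sketch of that (hz - stathz) / hz estimate, under a simplified
model: one second is cut into hz slots, statclock fires at evenly spaced
instants, and a process that knows the phase runs only in slots with no
statclock tick.  hidden_run_slots() is a hypothetical helper, and 1000 and
128 are just example values for hz and stathz.

```python
def hidden_run_slots(hz, stathz):
    """Return (slots_without_statclock_tick, total_slots) for one second."""
    if not 0 < stathz < hz:
        raise ValueError("model assumes 0 < stathz < hz")
    # Slot index (out of hz per second) in which the k-th statclock tick lands.
    sampled = {(k * hz) // stathz for k in range(stathz)}
    # With hz > stathz each tick lands in a distinct slot, so exactly
    # stathz slots are sampled and the rest are never seen.
    return hz - len(sampled), hz


safe, total = hidden_run_slots(hz=1000, stathz=128)
print(safe, total)
```

With these example values the process can use 872 of 1000 slots per second
and never be charged a statclock tick.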

Bruce
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net