On Mon, Apr 7, 2014 at 8:17 PM, Frederic Weisbecker <fweis...@gmail.com> wrote:
> The following example displays all the nonsense of that stat:
>
>     CPU 0                          CPU 1
>
>     task A block on IO             ...
>     task B runs for 1 min          ...
>     task A completes IO
>
> So in the above we've been waiting on IO for 1 minute. But none of that
> have been accounted.

If there is task B which can put CPU to use while task A waits for IO,
then *system performance is not IO bound*.

> OTOH if task B were to run on CPU 1 (it could have,
> really here this is about scheduler load balancing internals, hence pure
> randomness for the user), the iowait time would have been accounted.

Case A: overall system stats are: 50% busy, 50% idle
Case B: overall system stats are: 50% busy, 50% iowait

You are right, this does not look correct.

Let's step back and look at the situation from a high-level POV.
I believe I have a solution for this problem.

Let's say we have a heavily loaded file server machine where CPUs
are busy only 5% of the time. It makes sense to say that the machine
as a whole is "95% waiting for IO".

Our existing accounting did exactly that for single-CPU machines.

But for, say, a 2-CPU machine it can show 5% busy, 45% iowait, 50% idle
if there is only one task reading files, or 5% busy, 95% iowait
if there is more than one task.

But it's wrong! NONE of the CPUs are "idle" as long as there is even
one task blocked on IO. The machine is still IO-bound, not idling.

In my example, it should not matter whether the machine has one
or 64 CPUs, it should show 5% busy, 95% iowait overall state
in both cases.

Does the above make sense to you?

My proposal is to count each CPU's time towards iowait
if there are task(s) blocked on IO, *regardless of which
runqueue they are on*. Only if there are none is the time
counted towards idle.

> I doubt that users are interested in such random accounting. They want
> to know either:
>
> 1) how much time was spent waiting on IO by the whole system

Hmm, I think I just said the same thing :)

> 2) how much time was spent waiting on IO per task
> 3) how much time was spent waiting on IO per CPU that initiated
>    IOs, or per CPU which ran task completing IOs. In order to have
>    an overview on where these mostly happened.

Some people may want to know these things, and I am not objecting
to adding whatever counters to help with that.

--
vda
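
The two accounting policies discussed in this message can be compared
with a small toy model. This is only a sketch under stated assumptions,
not kernel code: time is split into one-second ticks, each CPU is
described per tick by a hypothetical (running, nr_iowait) pair, and the
names account, case_a and case_b are invented for illustration. It also
assumes that only a CPU's non-busy time is reclassified. "per_cpu"
models the existing behaviour as described above (a non-busy CPU counts
as iowait only if a task blocked on its own runqueue), and "global"
models the proposal (a non-busy CPU counts as iowait if any task
anywhere is blocked on IO).

```python
from collections import Counter


def account(ticks, policy):
    """Classify every CPU-tick as busy, iowait or idle; return percentages.

    ticks:  list of ticks; each tick holds one (running, nr_iowait) pair
            per CPU, where nr_iowait counts tasks that blocked on IO
            while on that CPU's runqueue (hypothetical model).
    policy: "per_cpu" (existing accounting) or "global" (the proposal).
    """
    totals = Counter()
    for tick in ticks:
        # Is any task in the whole system blocked on IO during this tick?
        any_blocked = any(nr_iowait > 0 for _, nr_iowait in tick)
        for running, nr_iowait in tick:
            if policy == "global":
                waiting = any_blocked
            else:
                waiting = nr_iowait > 0
            if running:
                totals["busy"] += 1
            elif waiting:
                totals["iowait"] += 1
            else:
                totals["idle"] += 1
    n = sum(totals.values()) or 1  # avoid dividing by zero on empty input
    return {k: 100 * totals[k] // n for k in ("busy", "iowait", "idle")}


# Two CPUs, one minute. Task A blocked on IO from CPU 0's runqueue.
# Case A: task B runs on CPU 0, CPU 1 has nothing to do.
case_a = [[(True, 1), (False, 0)]] * 60
# Case B: task B runs on CPU 1, CPU 0 has nothing to do.
case_b = [[(False, 1), (True, 0)]] * 60

for name, case in (("A", case_a), ("B", case_b)):
    for policy in ("per_cpu", "global"):
        print("Case", name, policy, account(case, policy))
```

In this model the "per_cpu" policy gives 50% busy, 50% idle for case A
and 50% busy, 50% iowait for case B, matching the numbers above, while
the "global" policy gives 50% busy, 50% iowait for both, so the result
no longer depends on which CPU the scheduler picked for task B.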