On 7/5/18 9:42 PM, Peter Zijlstra wrote:
> On Thu, Jul 05, 2018 at 09:21:15PM +0800, Xunlei Pang wrote:
>> On 7/5/18 6:46 PM, Peter Zijlstra wrote:
>>> On Wed, Jun 27, 2018 at 08:22:42PM +0800, Xunlei Pang wrote:
>>>> tick-based whole utime is utime_0, tick-based whole stime
>>>> is stime_0, scheduler time is rtime_0.
>>>
>>>> For a long time, the process runs mainly in userspace with
>>>> run-sleep patterns, and because two different clocks, it
>>>> is possible to have the following condition:
>>>>   rtime_0 < utime_0 (as with little stime_0)
>>>
>>> I don't follow... what?
>>>
>>> Why are you, and why do you think it makes sense to, compare rtime_0
>>> against utime_0 ?
>>>
>>> The [us]time_0 are, per your earlier definition, ticks. They're not an
>>> actual measure of time. Do not compare the two, that makes no bloody
>>> sense.
>>>
>>
>> [us]time_0 is task_struct:utime{stime}, I cited directly from
>> cputime_adjust(), both in nanoseconds. I assumed "rtime_0 < utime_0"
>> here to simple the following proof to help explain the problem we met.
>
> In the !VIRT_CPU_ACCOUNTING case they (task_struct::[us]time) are not
> actual durations. Yes, the happen to be accounted in multiples of
> TICK_NSEC and thereby happen to carry a [ns] unit, but they are not
> durations, they are samples.
>
> (we just happen to store them in a [ns] unit because for
> VIRT_CPU_ACCOUNTING they are in fact durations)
>
> If 'rtime < utime' is not a valid assumption to build a problem on for
> !VIRT_CPU_ACCOUNTING.
>

The condition is rtime < utime + stime; that is, the imprecise tick-based
run time may be larger than the precise scheduler-based run time
(sum_exec_runtime). This can happen with some frequent run-sleep
patterns. Because stime is usually very small, it is then possible to
have rtime < utime.

>
> So please try again, so far you're not making any sense.
>

I also have a reproducer to verify this patch, and can attach it tomorrow.
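
For illustration only (this is not the reproducer mentioned above), here
is a minimal user-space sketch of how tick sampling can charge more time
than a run-sleep task actually ran. Everything in it is an assumption
made for the example: the function name simulate(), the TICK_NSEC value
(HZ=250 is assumed), and the wake-up pattern in which the task happens to
be running each time the tick fires. It is not kernel code and does not
model cputime_adjust().

```python
# Assumed tick period: 4 ms, i.e. HZ=250 (hypothetical choice for the sketch).
TICK_NSEC = 4_000_000


def simulate(n_ticks, run_ns, tick_nsec=TICK_NSEC):
    """Return (utime, rtime) in ns for a hypothetical run-sleep task.

    The task wakes up shortly before every tick, runs for run_ns
    nanoseconds straddling the tick, then sleeps until the next one.
    """
    utime = 0  # tick-sampled: one whole tick charged per tick that hits the task
    rtime = 0  # precise: sum of the real run intervals (like sum_exec_runtime)
    for k in range(1, n_ticks + 1):
        tick_at = k * tick_nsec
        start = tick_at - run_ns // 2
        end = start + run_ns
        rtime += end - start
        # The tick only sees whether the task is running at that instant.
        if start <= tick_at < end:
            utime += tick_nsec
    return utime, rtime


if __name__ == "__main__":
    utime, rtime = simulate(n_ticks=1000, run_ns=1_000_000)
    print("utime(ns) =", utime, "rtime(ns) =", rtime, "rtime < utime:", rtime < utime)
```

In this sketch the task runs 1 ms out of every 4 ms, yet each tick
charges it a full TICK_NSEC, so the sampled utime ends up four times the
precise rtime. A real workload would not align with the tick this
neatly; the point is only that the sampled value is not bounded by the
measured one.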