On Wed, Aug 10, 2016 at 08:57:28PM +0200, Mike Galbraith wrote:
> sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
> 
> Roughly 10% of the time, ltp testcase getrusage04 fails:
> getrusage04    0  TINFO  :  Expected timers granularity is 4000 us
> getrusage04    0  TINFO  :  Using 1 as multiply factor for max [us]time increment (1000+4000us)!
> getrusage04    0  TINFO  :  utime:           0us; stime:         179us
> getrusage04    0  TINFO  :  utime:        3751us; stime:           0us
> getrusage04    1  TFAIL  :  getrusage04.c:133: stime increased > 5000us:
> 
> If ->sum_exec_runtime has moved beyond the rtime of ->prev_cputime, but
> no time has as yet been accounted to the task, bail.
> 
> Fixes: 9d7fb0427648 ("sched/cputime: Guarantee stime + utime == rtime")
> Signed-off-by: Mike Galbraith <umgwanakikb...@gmail.com>
> Cc: sta...@vger.kernel.org # 4.3+
> ---
>  kernel/sched/cputime.c |    7 +++++++
>  1 file changed, 7 insertions(+)
> 
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -606,6 +606,13 @@ static void cputime_adjust(struct task_c
>       stime = curr->stime;
>       utime = curr->utime;
>  
> +     /*
> +      * sum_exec_runtime has moved, but nothing has yet been
> +      * accounted to the task, there's nothing to update.
> +      */
> +     if (utime + stime == 0)
> +             goto out;

urgh...

Valid scenario, but I'm not sure about the solution. A task that forever
dodges the tick would never be charged _any_ running time, which would
be bad.
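
To make the concern concrete, here is a rough user-space model of the
early bail. It is only a sketch: struct sample and adjust_with_bail()
are hypothetical names, locking and the monotonicity clamps are left
out, and the proportional split is a simplified stand-in for
scale_stime().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the kernel's cputime pair. */
struct sample {
	uint64_t utime;	/* tick-sampled user time */
	uint64_t stime;	/* tick-sampled system time */
};

/*
 * Model of the early bail: if no tick has been charged to the task
 * yet (utime + stime == 0), keep the previously reported values and
 * ignore rtime entirely.
 */
static void adjust_with_bail(struct sample curr, uint64_t rtime,
			     struct sample *prev)
{
	assert(prev);

	/* Nothing new to report. */
	if (prev->utime + prev->stime >= rtime)
		return;

	/* The bail under discussion. */
	if (curr.utime + curr.stime == 0)
		return;

	/*
	 * Simplified proportional split of rtime; no overflow handling
	 * and no monotonicity clamps, unlike the real code.
	 */
	prev->stime = curr.stime * rtime / (curr.utime + curr.stime);
	prev->utime = rtime - prev->stime;
}
```

In this model, a task whose tick samples stay at zero keeps reporting
zero utime and stime no matter how far rtime has advanced.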

Does something like so cure things too?

---
 kernel/sched/cputime.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 9858266fb0b3..2ee83b200504 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -614,19 +614,25 @@ static void cputime_adjust(struct task_cputime *curr,
        stime = curr->stime;
        utime = curr->utime;
 
-       if (utime == 0) {
-               stime = rtime;
+       /*
+        * If either stime or both stime and utime are 0, assume all runtime is
+        * userspace. Once a task gets some ticks, the monotonicity code at
+        * 'update' will ensure things converge to the observed ratio.
+        */
+       if (stime == 0) {
+               utime = rtime;
                goto update;
        }
 
-       if (stime == 0) {
-               utime = rtime;
+       if (utime == 0) {
+               stime = rtime;
                goto update;
        }
 
        stime = scale_stime((__force u64)stime, (__force u64)rtime,
                            (__force u64)(stime + utime));
 
+update:
        /*
         * Make sure stime doesn't go backwards; this preserves monotonicity
         * for utime because rtime is monotonic.
@@ -649,7 +655,6 @@ static void cputime_adjust(struct task_cputime *curr,
                stime = rtime - utime;
        }
 
-update:
        prev->stime = stime;
        prev->utime = utime;
 out:
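
For anyone who wants to poke at the ordering outside the kernel, the
logic above can be modelled roughly as follows. This is a sketch under
assumptions, not the kernel code: struct cputime_model,
scale_stime_model() and cputime_adjust_model() are hypothetical names,
plain u64 math replaces cputime_t, locking is dropped, and the clamp
lines that fall between the two hunks are filled in from their
surrounding comments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the utime/stime pair. */
struct cputime_model {
	uint64_t utime;
	uint64_t stime;
};

/* Simplified stand-in for scale_stime(); no overflow handling. */
static uint64_t scale_stime_model(uint64_t stime, uint64_t rtime,
				  uint64_t total)
{
	assert(total != 0);
	return stime * rtime / total;
}

static void cputime_adjust_model(struct cputime_model curr, uint64_t rtime,
				 struct cputime_model *prev)
{
	uint64_t stime = curr.stime;
	uint64_t utime = curr.utime;

	/* Already reported at least rtime: nothing to update. */
	if (prev->stime + prev->utime >= rtime)
		return;

	/* No system ticks (or no ticks at all): assume userspace. */
	if (stime == 0) {
		utime = rtime;
		goto update;
	}

	/* Only system ticks seen: charge everything to stime. */
	if (utime == 0) {
		stime = rtime;
		goto update;
	}

	stime = scale_stime_model(stime, rtime, stime + utime);

update:
	/* stime must not go backwards; utime follows from rtime. */
	if (stime < prev->stime)
		stime = prev->stime;
	utime = rtime - stime;

	/* utime must not go backwards either. */
	if (utime < prev->utime) {
		utime = prev->utime;
		stime = rtime - utime;
	}

	prev->stime = stime;
	prev->utime = utime;
}
```

Feeding this a tickless sample (utime = stime = 0, rtime = 179)
followed by one user tick (utime = 4000, stime = 0, rtime = 3930)
gives 179/0 and then 3930/0 in the model, so neither value goes
backwards and the runtime is charged even before the first tick lands.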
