On 5/30/12 11:27 PM, Andrey Zonov wrote:
Hi,

I have a long-running process for which `ps -o usertime -p $pid' always
shows the same time, 6190:07.65, while `ps -o cputime -p $pid' for the same
process continues to grow and is now at 21538:53.61. It looks like an
overflow in the resource usage code or something similar.


I reproduced the problem with the attached program. I ran it with 23 threads on a machine with 24 CPUs, and after a night I see this:

$ ps -o usertime,time -p 24134 && sleep 60 && ps -o usertime,time -p 24134
  USERTIME        TIME
6351:24.74 14977:35.19
  USERTIME        TIME
6351:24.74 15000:34.53

Per-thread user time is counted correctly:

$ ps -H -o usertime,time -p 24134
 USERTIME      TIME
  0:00.00   0:00.00
652:35.84 652:38.59
652:34.75 652:37.97
652:50.46 652:51.97
652:38.93 652:43.08
652:39.73 652:43.36
652:44.09 652:47.36
652:56.49 652:57.94
652:51.84 652:54.41
652:37.48 652:41.57
652:36.61 652:40.90
652:39.41 652:42.52
653:03.72 653:06.72
652:49.96 652:53.25
652:45.92 652:49.03
652:40.33 652:42.05
652:46.53 652:49.31
652:44.77 652:47.33
653:00.54 653:02.24
652:33.31 652:36.13
652:51.03 652:52.91
652:50.73 652:52.71
652:41.32 652:44.64
652:59.86 653:03.25

(kgdb) p $my->p_rux
$14 = {rux_runtime = 2171421985692826, rux_uticks = 114886093, rux_sticks = 8353, rux_iticks = 0, rux_uu = 381084736784, rux_su = 65773652, rux_tu = 904571706136}
(kgdb) p $my->p_rux
$15 = {rux_runtime = 2191831516209186, rux_uticks = 115966087, rux_sticks = 8444, rux_iticks = 0, rux_uu = 381084736784, rux_su = 66458587, rux_tu = 913099969825}

As you can see, rux_uu is stale, but rux_uticks is still ticking. I think the problem is in calcru1(). This expression

uu = (tu * ut) / tt

overflows.

I applied the following patch:

Index: /usr/src/sys/kern/kern_resource.c
===================================================================
--- /usr/src/sys/kern/kern_resource.c   (revision 235394)
+++ /usr/src/sys/kern/kern_resource.c   (working copy)
@@ -885,7 +885,7 @@ calcru1(struct proc *p, struct rusage_ext *ruxp, s
     struct timeval *sp)
 {
        /* {user, system, interrupt, total} {ticks, usec}: */
-       uint64_t ut, uu, st, su, it, tt, tu;
+       uint64_t ut, uu, st, su, it, tt, tu, tmp;

        ut = ruxp->rux_uticks;
        st = ruxp->rux_sticks;
@@ -909,10 +909,20 @@ calcru1(struct proc *p, struct rusage_ext *ruxp, s
                 * The normal case, time increased.
                 * Enforce monotonicity of bucketed numbers.
                 */
-               uu = (tu * ut) / tt;
+               if (ut == 0)
+                       uu = 0;
+               else {
+                       tmp = tt / ut;
+                       uu = tmp ? tu / tmp : 0;
+               }
                if (uu < ruxp->rux_uu)
                        uu = ruxp->rux_uu;

and am now running the test again.

--
Andrey Zonov
/*
 * Andrey Zonov (c) 2012
 */

#include <err.h>
#include <pthread.h>
#include <stdlib.h>

void *func(void *arg);

int
main(int argc, char **argv)
{
        int i;
        int threads;
        int *tid;
        pthread_t *tds;

        if (argc != 2)
                errx(1, "usage: usertime <threads>");

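        threads = atoi(argv[1]);
        if (threads <= 0)
                errx(1, "invalid thread count: %s", argv[1]);
        tid = malloc(sizeof(int) * threads);
        tds = malloc(sizeof(pthread_t) * threads);
        if (tid == NULL || tds == NULL)
                err(1, "malloc");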

        for (i = 0; i < threads; i++) {
                tid[i] = i;
                if (pthread_create(&tds[i], NULL, func, &tid[i]) != 0)
                        err(1, "pthread_create(%d)", i);
        }

        for (i = 0; i < threads; i++)
                if (pthread_join(tds[i], NULL) != 0)
                        err(1, "pthread_join(%d)", i);

        exit(0);
}

void *
func(void *arg __unused)
{
        int i;

#define MAX     (1<<20)

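        /*
         * Spin forever: i is reset to 0 when it reaches MAX - 1, so the
         * loop never terminates and the thread only burns user time.
         */
        for (i = 0; i < MAX; i++) {
                if ((i % (MAX - 1)) == 0) {
                        i = 0;
                        /*usleep(1);*/
                }
        }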

        pthread_exit(NULL);
}
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers