FYI: There are a pair of kernel commits that *might* solve this issue, at least for newer kernels. The first can be found in Linus's tree (as well as linux-next) and the second is currently only in linux-next:
[scroll@scroll-kernel-vm linux-next]$ git remote -v origin http://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (fetch) origin http://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (push) [scroll@scroll-kernel-vm linux-next]$ git log -n1 bea6832cc8c4a0a9a65dd17da6aaa657fe27bc3e commit bea6832cc8c4a0a9a65dd17da6aaa657fe27bc3e Author: Stanislaw Gruszka <sgrus...@redhat.com> Date: Wed Aug 8 11:27:15 2012 +0200 sched: fix divide by zero at {thread_group,task}_times On architectures where cputime_t is 64 bit type, is possible to trigger divide by zero on do_div(temp, (__force u32) total) line, if total is a non zero number but has lower 32 bit's zeroed. Removing casting is not a good solution since some do_div() implementations do cast to u32 internally. This problem can be triggered in practice on very long lived processes: PID: 2331 TASK: ffff880472814b00 CPU: 2 COMMAND: "oraagent.bin" #0 [ffff880472a51b70] machine_kexec at ffffffff8103214b #1 [ffff880472a51bd0] crash_kexec at ffffffff810b91c2 #2 [ffff880472a51ca0] oops_end at ffffffff814f0b00 #3 [ffff880472a51cd0] die at ffffffff8100f26b #4 [ffff880472a51d00] do_trap at ffffffff814f03f4 #5 [ffff880472a51d60] do_divide_error at ffffffff8100cfff #6 [ffff880472a51e00] divide_error at ffffffff8100be7b [exception RIP: thread_group_times+0x56] RIP: ffffffff81056a16 RSP: ffff880472a51eb8 RFLAGS: 00010046 RAX: bc3572c9fe12d194 RBX: ffff880874150800 RCX: 0000000110266fad RDX: 0000000000000000 RSI: ffff880472a51eb8 RDI: 001038ae7d9633dc RBP: ffff880472a51ef8 R8: 00000000b10a3a64 R9: ffff880874150800 R10: 00007fcba27ab680 R11: 0000000000000202 R12: ffff880472a51f08 R13: ffff880472a51f10 R14: 0000000000000000 R15: 0000000000000007 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffff880472a51f00] do_sys_times at ffffffff8108845d #8 [ffff880472a51f40] sys_times at ffffffff81088524 #9 [ffff880472a51f80] system_call_fastpath at ffffffff8100b0f2 RIP: 0000003808caac3a RSP: 00007fcba27ab6d8 RFLAGS: 00000202 RAX: 0000000000000064 RBX: ffffffff8100b0f2 RCX: 0000000000000000 RDX: 00007fcba27ab6e0 RSI: 000000000076d58e RDI: 00007fcba27ab6e0 RBP: 00007fcba27ab700 R8: 0000000000000020 R9: 000000000000091b R10: 00007fcba27ab680 R11: 0000000000000202 R12: 00007fff9ca41940 R13: 0000000000000000 R14: 00007fcba27ac9c0 R15: 00007fff9ca41940 ORIG_RAX: 0000000000000064 CS: 0033 SS: 002b Cc: sta...@vger.kernel.org Signed-off-by: Stanislaw Gruszka <sgrus...@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijls...@chello.nl> Link: http://lkml.kernel.org/r/20120808092714.ga3...@redhat.com Signed-off-by: Thomas Gleixner <t...@linutronix.de> [scroll@scroll-kernel-vm linux-next]$ git log -n1 62188451f0d63add7ad0cd2a1ae269d600c1663d commit 62188451f0d63add7ad0cd2a1ae269d600c1663d Author: Frederic Weisbecker <fweis...@gmail.com> Date: Sat Jan 26 17:19:42 2013 +0100 cputime: Avoid multiplication overflow on utime scaling We scale stime, utime values based on rtime (sum_exec_runtime converted to jiffies). During scaling we multiple rtime * utime, which seems to be fine, since both values are converted to u64, but it's not. Let assume HZ is 1000 - 1ms tick. Process consist of 64 threads, run for 1 day, threads utilize 100% cpu on user space. Machine has 64 cpus. Process rtime = utime will be 64 * 24 * 60 * 60 * 1000 jiffies, which is 0x149970000. Multiplication rtime * utime result is 0x1a855771100000000, which can not be covered in 64 bits. Result of overflow is stall of utime values visible in user space (prev_utime in kernel), even if application still consume lot of CPU time. A solution to solve this is to perform the multiplication on stime instead of utime. It's easy to grow the utime value fast with a CPU bound thread in userspace for example. Now we assume that doing so with stime is much harder. In most cases a task shouldn't ever spend much time in kernel space as it tends to sleep waiting for jobs completion when they take long to achieve. IO is the typical example of that. Hence scaling the cputime by performing the multiplication on stime instead of utime should considerably reduce the chances of an overflow on most workloads. This is largely inspired by a patch from Stanislaw Gruszka: http://lkml.kernel.org/r/20130107113144.ga7...@redhat.com Inspired-by: Stanislaw Gruszka <sgrus...@redhat.com> Reported-by: Stanislaw Gruszka <sgrus...@redhat.com> Acked-by: Stanislaw Gruszka <sgrus...@redhat.com> Signed-off-by: Frederic Weisbecker <fweis...@gmail.com> Cc: Oleg Nesterov <o...@redhat.com> Cc: Peter Zijlstra <pet...@infradead.org> Cc: Andrew Morton <a...@linux-foundation.org> Link: http://lkml.kernel.org/r/1359217182-25184-1-git-send-email-fweis...@gmail.com Signed-off-by: Ingo Molnar <mi...@kernel.org> -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1023214 Title: bogus utime and stime in /proc/<PID/stat To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1023214/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs