I've spent a good part of today looking at this. It feels like the problem may lie in the process scheduler. When I pin the CPU-burning process to CPU0 (via "taskset -pc 0 $pid_printed_by_a_out") and also pin a bash shell to CPU0, the bash process fails to wake after sleeping (i.e., it's runnable, but CFS isn't giving it time). Sometimes bash starts being scheduled again after around 3 minutes; other times it just sits there.
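For anyone who wants to try the pinning experiment without the original a.out, here is a minimal Python sketch of the same idea. It is only an approximation under stated assumptions: the function name wake_latencies and its parameters are made up for illustration, the busy loop is a stand-in for the CPU-burning program (whose source isn't shown here), and os.sched_setaffinity is used in place of taskset. It is Linux-only.

```python
import os
import signal
import time


def wake_latencies(cpu=0, sleep_s=0.1, samples=5):
    """Return how late (seconds) each sleep woke while sharing `cpu` with a spinner.

    Hypothetical helper, Linux only (relies on os.sched_setaffinity and os.fork).
    """
    original = os.sched_getaffinity(0)
    os.sched_setaffinity(0, {cpu})  # same effect as: taskset -pc <cpu> <our pid>
    pid = os.fork()                 # the child inherits the single-CPU affinity
    if pid == 0:
        try:
            while True:             # stand-in for the CPU-burning a.out
                pass
        finally:
            os._exit(0)
    try:
        delays = []
        for _ in range(samples):
            start = time.monotonic()
            time.sleep(sleep_s)     # the sleeper should be runnable again after sleep_s
            delays.append(time.monotonic() - start - sleep_s)
        return delays
    finally:
        os.kill(pid, signal.SIGKILL)       # stop the spinner
        os.waitpid(pid, 0)                 # reap it
        os.sched_setaffinity(0, original)  # undo the pinning
```

Calling wake_latencies(cpu=0) puts a sleeper and a spinner on the same CPU, much like the bash shell and a.out above. If the behavior described here is present, the returned delays should be large, or the call may not return at all; on an unaffected kernel they should stay small.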
Every scheduler debug trace I've captured (triggered via "echo w > /proc/sysrq-trigger") has shown other runnable processes on the spinning CPU that don't seem to be getting scheduled at all. I've not been able to reproduce this problem on the kernel used in the Amazon Linux AMI (currently 2.6.34.7). This is in line with other users' observations (http://twitter.com/#!/synack/status/30415380321140737). I think that Canonical might need to look into what (if any) changes they've made to CFS in the 10.04 kernel tree. It's also possible that improvements made to CFS between 2.6.32 and 2.6.34 account for the better behavior.
--
https://bugs.launchpad.net/bugs/708920
Title: Strange 'fork/clone' blocking behavior under high cpu usage on EC2