Hello,

I have some trouble with a multithreaded Java network server running on SLES10. At random times the kernel takes over 80% of the CPU, leaving idle at 0%, for about 30 seconds. After this period the system returns to normal operation.

Below is a vmstat -a 3 recording that shows the problem:

procs  -----------memory---------- ---swap-- -----io---- -system-- ------cpu------
  r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy  id wa st
  1  0      0 773068 529184 693048    0    0     0     0  272  201  0  0 100  0  0
  0  0      0 773068 529184 693064    0    0     0    25  317  334  1  0  99  1  0
  0  0      0 772944 529216 693248    0    0     0    24  477 1017  3  0  96  0  0
  0  0      0 772820 529256 693316    0    0     0     0  525 1376  4  1  95  0  0
  0  0      0 772448 529344 693636    0    0     0   107 1098 3306 11  2  86  0  0
  0  0      0 772324 529404 693456    0    0     0     0  723 2247  7  2  91  0  0
  0  0      0 772076 529496 693656    0    0     0   132  770 2488  7  2  91  1  0
  0  0      0 772200 529528 693608    0    0     0    91  528 1168  4  1  94  1  0
  0  0      0 772200 529532 693728    0    0     0     0  334  387  1  0  99  0  0
  0  0      0 772076 529568 693680    0    0     0    24  564 1250  4  1  95  0  0
  0  0      0 771828 529636 693784    0    0     0     0  787 2144  7  2  91  0  0
  0  0      0 771580 529744 694232    0    0     0   111  995 3081 11  2  86  1  0
107  0      0 771316 529792 694904    0    0     0   153  829 1650 12 37  51  0  0
113  0      0 771316 529792 694912    0    0     0     0  323  169 15 85   0  0  0
116  0      0 771216 529792 694728    0    0     0    25  292  190 14 86   0  0  0
122  0      0 771340 529792 694728    0    0     0    21  311  191 15 85   0  0  0
138  0      0 771464 529792 694728    0    0     0     0  365  196 14 86   0  0  0
146  0      0 771464 529792 694728    0    0     0     0  331  189 16 84   0  0  0
150  0      0 771472 529792 694728    0    0     0     0  336  183 15 85   0  0  0
146  0      0 771472 529792 694728    0    0     0     4  310  201 14 86   0  0  0
145  0      0 771472 529792 694728    0    0     0     0  285  163 15 85   0  0  0
146  0      0 771472 529792 694728    0    0     0     0  277  159 14 86   0  0  0
145  0      0 771472 529792 694728    0    0     0    32  275  133 15 85   0  0  0
  0  0      0 771208 529892 694176    0    0     0     0 1012 3408 12  4  84  0  0
  0  0      0 770712 529972 694488    0    0     0   149  774 2869  8  2  90  0  0
  0  0      0 770712 529972 694488    0    0     0     0  271  195  0  0 100  0  0
  0  0      0 770728 529972 694488    0    0     0    35  269  167  0  0 100  1  0
  0  0      0 770728 529972 694488    0    0     0     7  269  189  0  0 100  0  0

The application is memory stable (no leaks), and a deadlock is out of the question, since a deadlock would freeze the system permanently rather than temporarily.
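
In case it helps with longer recordings, here is a minimal sketch of how the stall window could be picked out of vmstat output automatically. It is not part of the application. The class name VmstatStall, the 80% threshold and the rule "sy at or above the threshold and id equal to 0" are my own assumptions for illustration, and it assumes the 17-column vmstat -a layout shown in the recording above, one sample per line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

/** Flags vmstat samples where system (kernel) time dominates and idle is 0. */
class VmstatStall {

    // Column layout assumed from the recording in this post (vmstat -a):
    // r b swpd free inact active si so bi bo in cs us sy id wa st
    static final int COL_SY = 13;
    static final int COL_ID = 14;
    static final int COLUMNS = 17;

    /** Returns the numeric columns, or null for header, blank or malformed lines. */
    static int[] parse(String line) {
        String[] tok = line.trim().split("\\s+");
        if (tok.length != COLUMNS) {
            return null;
        }
        int[] values = new int[COLUMNS];
        try {
            for (int i = 0; i < COLUMNS; i++) {
                values[i] = Integer.parseInt(tok[i]);
            }
        } catch (NumberFormatException e) {
            return null; // e.g. the "r b swpd ..." column-name line
        }
        return values;
    }

    /** True if the sample has sy >= sysThreshold and id == 0. */
    static boolean isStalled(String line, int sysThreshold) {
        int[] v = parse(line);
        return v != null && v[COL_SY] >= sysThreshold && v[COL_ID] == 0;
    }

    /** Length, in samples, of the longest run of consecutive stalled samples. */
    static int longestStall(List<String> lines, int sysThreshold) {
        int best = 0;
        int run = 0;
        for (String line : lines) {
            if (parse(line) == null) {
                continue; // repeated headers do not interrupt a run
            }
            run = isStalled(line, sysThreshold) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) throws IOException {
        // Sampling interval in seconds; defaults to 3 as in "vmstat -a 3".
        int intervalSeconds = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            for (String l; (l = in.readLine()) != null; ) {
                lines.add(l);
            }
        }
        int samples = longestStall(lines, 80);
        System.out.println("longest stall: " + samples + " samples (~"
                + samples * intervalSeconds + " s)");
    }
}
```

It could be run as "vmstat -a 3 | java VmstatStall 3" (the output only appears once vmstat exits) or fed a saved recording on standard input. In the recording above, ten consecutive samples satisfy the rule, which agrees with the roughly 30 seconds I observe at a 3-second interval.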

There are around 200-250 TCP/IP clients connected to the application, and 550 threads (blocking stream sockets are used, so every client is handled by one reading thread and one writing thread).

The same application works fine on SLES9.3.

Hanging environment:
-----------------------------------------------------------------------------
mustang:~ # uname -a
Linux mustang 2.6.16.21-0.25-smp #1 SMP Tue Sep 19 07:26:15 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
mustang:~ # java -version
java version "1.6.0-rc"
Java(TM) SE Runtime Environment (build 1.6.0-rc-b104)
Java HotSpot(TM) Server VM (build 1.6.0-rc-b104, mixed mode)
mustang:~ # cat /etc/SuSE-release
SUSE Linux Enterprise Server 10 (x86_64)
VERSION = 10
-----------------------------------------------------------------------------

Working environment:
-----------------------------------------------------------------------------
apollo:~ # uname -a
Linux apollo 2.6.5-7.252-smp #1 SMP Tue Feb 14 11:11:04 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
apollo:~ # java -version
java version "1.6.0-rc"
Java(TM) SE Runtime Environment (build 1.6.0-rc-b95)
Java HotSpot(TM) 64-Bit Server VM (build 1.6.0-rc-b95, mixed mode)
apollo:~ # cat /etc/SuSE-release
SUSE LINUX Enterprise Server 9 (x86_64)
VERSION = 9
PATCHLEVEL = 3
-----------------------------------------------------------------------------

Can you give me some pointers on where to start debugging this issue?

Regards,
Horia