[ https://issues.jenkins-ci.org/browse/JENKINS-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=161942#comment-161942 ]
Hans-Juergen Hafner commented on JENKINS-6817:
----------------------------------------------

Hi,

I'm not sure whether this is the right place to report our observations about channel termination.

Several times a day Jenkins loses some slaves. We are currently running Jenkins 1.454, but we also saw the problem with 1.457. The master and almost all slaves run on Linux machines.
I suspect the problem has something to do with garbage collection. When the problem occurs, Jenkins keeps all 24 hyper-threads of the CPU at almost 100%.

Here is the heap usage (from space 99%):
{noformat}
Heap
 PSYoungGen      total 7202048K, used 23148K [0x0000000600000000, 0x0000000800000000, 0x0000000800000000)
  eden space 7181888K, 0% used [0x0000000600000000,0x00000006002ecb30,0x00000007b6590000)
  from space 20160K, 99% used [0x00000007b6590000,0x00000007b793e4f8,0x00000007b7940000)
  to   space 26624K, 0% used [0x00000007fe600000,0x00000007fe600000,0x0000000800000000)
 PSOldGen        total 8388608K, used 2438211K [0x0000000200000000, 0x0000000400000000, 0x0000000600000000)
  object space 8388608K, 29% used [0x0000000200000000,0x0000000294d10d10,0x0000000400000000)
 PSPermGen       total 1048576K, used 85742K [0x00000001c0000000, 0x0000000200000000, 0x0000000200000000)
  object space 1048576K, 8% used [0x00000001c0000000,0x00000001c53bb888,0x0000000200000000)
{noformat}

Excerpt from the Jenkins log (master):
{noformat}
Apr 24, 2012 11:14:16 AM hudson.remoting.Channel$ReaderThread run
SEVERE: I/O error in channel ullteb15
java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.Channel$ReaderThread.run(Channel.java:1133)
Caused by: java.io.EOFException
    at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.readObject(Unknown Source)
    at hudson.remoting.Channel$ReaderThread.run(Channel.java:1127)
Apr 24, 2012 11:14:16 AM hudson.remoting.Channel$ReaderThread run
SEVERE: I/O error in channel ullteb16
java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.Channel$ReaderThread.run(Channel.java:1133)
Caused by: java.io.EOFException
    at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.readObject(Unknown Source)
    at hudson.remoting.Channel$ReaderThread.run(Channel.java:1127)
Apr 24, 2012 11:14:26 AM hudson.remoting.Request$2 run
SEVERE: Failed to send back a reply
java.io.IOException: Broken pipe
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(Unknown Source)
    at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
    at java.io.BufferedOutputStream.flush(Unknown Source)
    at java.io.ObjectOutputStream$BlockDataOutputStream.flush(Unknown Source)
    at java.io.ObjectOutputStream.flush(Unknown Source)
    at hudson.remoting.Channel.send(Channel.java:505)
    at hudson.remoting.Request$2.run(Request.java:301)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Apr 24, 2012 11:14:26 AM hudson.remoting.Channel$ReaderThread run
SEVERE: I/O error in channel ulcppit01
java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.Channel$ReaderThread.run(Channel.java:1133)
Caused by: java.io.EOFException
    at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.readObject(Unknown Source)
    at hudson.remoting.Channel$ReaderThread.run(Channel.java:1127)
Apr 24, 2012 11:14:27 AM hudson.remoting.Request$2 run
SEVERE: Failed to send back a reply
java.io.IOException: Broken pipe
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(Unknown Source)
    at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
    at java.io.BufferedOutputStream.flush(Unknown Source)
    at java.io.ObjectOutputStream$BlockDataOutputStream.flush(Unknown Source)
    at java.io.ObjectOutputStream.flush(Unknown Source)
    at hudson.remoting.Channel.send(Channel.java:505)
    at hudson.remoting.Request$2.run(Request.java:301)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Apr 24, 2012 11:14:26 AM hudson.remoting.Request$2 run
SEVERE: Failed to send back a reply
{noformat}

And here is the log from slave ullteb15:
{noformat}
Ping failed. Terminating
ERROR: Connection terminated
java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.Channel$ReaderThread.run(Channel.java:1133)
Caused by: java.io.EOFException
    at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.readObject(Unknown Source)
    at hudson.remoting.Channel$ReaderThread.run(Channel.java:1127)
ERROR: Process terminated with exit code 255
{noformat}

Log from slave ullteb28:
{noformat}
Apr 24, 2012 11:13:56 AM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel.
java.util.concurrent.TimeoutException: Ping started on 1335258596049 hasn't completed at 1335258836049
    at hudson.remoting.PingThread.ping(PingThread.java:114)
    at hudson.remoting.PingThread.run(PingThread.java:81)
Caused by: java.util.concurrent.TimeoutException
    at hudson.remoting.Request$1.get(Request.java:249)
    at hudson.remoting.Request$1.get(Request.java:184)
    at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
    at hudson.remoting.PingThread.ping(PingThread.java:107)
    ... 1 more
Apr 24, 2012 11:13:56 AM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel.
java.util.concurrent.TimeoutException: Ping started on 1335258596049 hasn't completed at 1335258836053
    at hudson.remoting.PingThread.ping(PingThread.java:114)
    at hudson.remoting.PingThread.run(PingThread.java:81)
Caused by: java.util.concurrent.TimeoutException
    at hudson.remoting.Request$1.get(Request.java:249)
    at hudson.remoting.Request$1.get(Request.java:184)
    at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
    at hudson.remoting.PingThread.ping(PingThread.java:107)
    ... 1 more
Connection terminated
channel stopped
{noformat}

> FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
> ----------------------------------------------------------------------------------------------------------
>
> Key: JENKINS-6817
> URL: https://issues.jenkins-ci.org/browse/JENKINS-6817
> Project: Jenkins
> Issue Type: Bug
> Components: clone-workspace, core
> Affects Versions: current
> Reporter: nirmal_patel
> Assignee: abayer
> Priority: Blocker
>
> I am seeing the same on my Windows XP master-slave setup. I am running the latest Hudson ver. 1.363.
> I am using the clone-workspace-scm plugin to copy my workspace from master to slave(150).
> Started by user anonymous
> Building remotely on 150
> FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
> hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
>     at hudson.remoting.Request.call(Request.java:137)
>     at hudson.remoting.Channel.call(Channel.java:555)
>     at hudson.FilePath.act(FilePath.java:742)
>     at hudson.FilePath.act(FilePath.java:735)
>     at hudson.FilePath.unzip(FilePath.java:415)
>     at hudson.FileSystemProvisioner$Default$WorkspaceSnapshotImpl.restoreTo(FileSystemProvisioner.java:227)
>     at hudson.plugins.cloneworkspace.CloneWorkspaceSCM$Snapshot.restoreTo(CloneWorkspaceSCM.java:344)
>     at hudson.plugins.cloneworkspace.CloneWorkspaceSCM.checkout(CloneWorkspaceSCM.java:126)
>     at hudson.model.AbstractProject.checkout(AbstractProject.java:1044)
>     at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:479)
>     at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:411)
>     at hudson.model.Run.run(Run.java:1253)
>     at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
>     at hudson.model.ResourceController.execute(ResourceController.java:88)
>     at hudson.model.Executor.run(Executor.java:127)
> Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
>     at hudson.remoting.Request.abort(Request.java:257)
>     at hudson.remoting.Channel.terminate(Channel.java:602)
>     at hudson.remoting.Channel$ReaderThread.run(Channel.java:893)
> Caused by: java.io.IOException: Unexpected termination of the channel
>     at hudson.remoting.Channel$ReaderThread.run(Channel.java:875)
> Caused by: java.io.EOFException
>     at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
>     at java.io.ObjectInputStream.readObject0(Unknown Source)
>     at java.io.ObjectInputStream.readObject(Unknown Source)
>     at
hudson.remoting.Channel$ReaderThread.run(Channel.java:869)