Hello,
One of the issues we have recently been experiencing with Jenkins is that the slaves (nodes) go offline for no apparent reason and do not reconnect automatically. When a slave appears offline, we try to launch/reconnect it manually, but that does not work either. However, we are still able to SSH into the machine using PuTTY. The only workaround is to restart the Jenkins server, which helps until the problem surfaces again (typically in about a week).

Instance Information
--------------------
Jenkins Server: 1.562
SSH Credentials Plugin: 1.6.1
SSH Slaves Plugin: 1.6

Thread dump of the slave nodes:

{dump}
"Channel reader thread: qa-linbuild-02" prio=5 WAITING
    java.lang.Object.wait(Native Method)
    java.lang.Object.wait(Object.java:485)
    com.trilead.ssh2.channel.ChannelManager.waitUntilChannelOpen(ChannelManager.java:109)
    com.trilead.ssh2.channel.ChannelManager.openSessionChannel(ChannelManager.java:583)
    com.trilead.ssh2.Session.<init>(Session.java:41)
    com.trilead.ssh2.Connection.openSession(Connection.java:1129)
    com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:99)
    com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:119)
    hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:1160)
    hudson.slaves.SlaveComputer$2.onClosed(SlaveComputer.java:437)
    hudson.remoting.Channel.terminate(Channel.java:819)
    hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:76)

"Channel reader thread: qa-linbuild-03" prio=5 WAITING
    java.lang.Object.wait(Native Method)
    java.lang.Object.wait(Object.java:485)
    com.trilead.ssh2.channel.ChannelManager.waitUntilChannelOpen(ChannelManager.java:109)
    com.trilead.ssh2.channel.ChannelManager.openSessionChannel(ChannelManager.java:583)
    com.trilead.ssh2.Session.<init>(Session.java:41)
    com.trilead.ssh2.Connection.openSession(Connection.java:1129)
    com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:99)
    com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:119)
    hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:1160)
    hudson.slaves.SlaveComputer$2.onClosed(SlaveComputer.java:437)
    hudson.remoting.Channel.terminate(Channel.java:819)
    hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:76)
{dump}

Also concerning is the number of threads in the BLOCKED state (126!). This doesn't seem normal, as there are no BLOCKED threads after the server is restarted.

{dump}
// 118 instances
"Computer.threadPoolForRemoting [#26]" daemon prio=5 BLOCKED
    hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:1152)
    hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:542)
    jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    java.util.concurrent.FutureTask.run(FutureTask.java:138)
    java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    java.lang.Thread.run(Thread.java:662)

// 8 instances
"Computer.threadPoolForRemoting [#2922]" daemon prio=5 BLOCKED
    hudson.plugins.sshslaves.SSHLauncher.launch(SSHLauncher.java:639)
    hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222)
    jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
    java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    java.util.concurrent.FutureTask.run(FutureTask.java:138)
    java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    java.lang.Thread.run(Thread.java:662)
{dump}

Looking forward to any ideas or suggestions.

Thank you.
Charles Chan