[ 
https://issues.jenkins-ci.org/browse/JENKINS-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Schulz updated JENKINS-13330:
-------------------------------------

    Attachment: jenkins-stall-threaddump.gz

A thread dump of the Jenkins master process when there were 8 stalled (slave) 
jobs.

[Apparently I did not succeed last time.]
                
> Jenkins slave hangs in post build phase
> ---------------------------------------
>
>                 Key: JENKINS-13330
>                 URL: https://issues.jenkins-ci.org/browse/JENKINS-13330
>             Project: Jenkins
>          Issue Type: Bug
>          Components: master-slave, slave-status
>         Environment: RHEL 5, both master and all slaves.
> Jenkins is running inside of Tomcat
>            Reporter: Clark Wright
>            Priority: Critical
>         Attachments: jenkins-stall-threaddump.gz, 
> jenkins-stall-threaddump.gz, Screenshot-galleon_allIntegration #1196 Console 
> [Jenkins] - Mozilla Firefox.png
>
>
> We have an intermittent problem with slaves hanging AFTER the job itself is 
> finished. In the post processing step (?) what we see is that the console log 
> has this line:
> Description set: vap_current_iter-2012_03_29_19_01_03
> And then nothing. Usually, it will look like this:
> Description set: prod_pull-2012_03_28_19_01_03
> Notifying upstream build armada_Launch_prod_pull #13 of job completion
> Project armada_Launch_prod_pull still waiting for 1 builds to complete
> Notifying upstream projects of job completion
> Notifying upstream of completion: armada_Launch_prod_pull #13
> Finished: SUCCESS
> I setup a logger for hudson.model.Run, and it currently has this :
>     at java.lang.Thread.run(Thread.java:619)
> Mar 30, 2012 12:44:00 PM hudson.model.Run run
> INFO: galleon_allUnit #1134 main build action completed: SUCCESS
> Mar 30, 2012 12:44:00 PM hudson.model.Run setResult
> FINE: galleon_allUnit #1134 : result is set to SUCCESS
> java.lang.Exception
>     at hudson.model.Run.setResult(Run.java:352)
>     at hudson.model.Run.run(Run.java:1410)
>     at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
>     at hudson.model.ResourceController.execute(ResourceController.java:88)
>     at hudson.model.Executor.run(Executor.java:238)
> Repeated for every hung slave.
> The main hudson log doesn't have any additional information.
> Disconnecting the slave has no effect.
> Trying to do an orderly shutdown of Jenkins has no effect (jenkins actually 
> appears to hang on shutdown).
> The only way we have found to recover is to kill -9 the tomcat process.
> The tread dump for one of the slaves (they are all the same) is:
> Thread Dump
> Channel reader thread: channel
> "Channel reader thread: channel" Id=9 Group=main RUNNABLE (in native)
>     at java.io.FileInputStream.readBytes(Native Method)
>     at java.io.FileInputStream.read(FileInputStream.java:199)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>     -  locked java.io.BufferedInputStream@1ae615a
>     at 
> java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2249)
>     at 
> java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2542)
>     at 
> java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2552)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
>     at hudson.remoting.Channel$ReaderThread.run(Channel.java:1030)
> main
> "main" Id=1 Group=main WAITING on hudson.remoting.Channel@e1d5ea
>     at java.lang.Object.wait(Native Method)
>     -  waiting on hudson.remoting.Channel@e1d5ea
>     at java.lang.Object.wait(Object.java:485)
>     at hudson.remoting.Channel.join(Channel.java:766)
>     at hudson.remoting.Launcher.main(Launcher.java:420)
>     at hudson.remoting.Launcher.runWithStdinStdout(Launcher.java:366)
>     at hudson.remoting.Launcher.run(Launcher.java:206)
>     at hudson.remoting.Launcher.main(Launcher.java:168)
> Ping thread for channel hudson.remoting.Channel@e1d5ea:channel
> "Ping thread for channel hudson.remoting.Channel@e1d5ea:channel" Id=10 
> Group=main TIMED_WAITING
>     at java.lang.Thread.sleep(Native Method)
>     at hudson.remoting.PingThread.run(PingThread.java:86)
> Pipe writer thread: channel
> "Pipe writer thread: channel" Id=12 Group=main WAITING on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@14263ed
>     at sun.misc.Unsafe.park(Native Method)
>     -  waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@14263ed
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>     at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
>     at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>     at java.lang.Thread.run(Thread.java:619)
> pool-1-thread-267
> "pool-1-thread-267" Id=285 Group=main RUNNABLE
>     at sun.management.ThreadImpl.dumpThreads0(Native Method)
>     at sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:374)
>     at hudson.Functions.getThreadInfos(Functions.java:872)
>     at 
> hudson.util.RemotingDiagnostics$GetThreadDump.call(RemotingDiagnostics.java:93)
>     at 
> hudson.util.RemotingDiagnostics$GetThreadDump.call(RemotingDiagnostics.java:89)
>     at hudson.remoting.UserRequest.perform(UserRequest.java:118)
>     at hudson.remoting.UserRequest.perform(UserRequest.java:48)
>     at hudson.remoting.Request$2.run(Request.java:287)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:619)
>     Number of locked synchronizers = 1
>     - java.util.concurrent.locks.ReentrantLock$NonfairSync@1186f88
> Finalizer
> "Finalizer" Id=3 Group=system WAITING on 
> java.lang.ref.ReferenceQueue$Lock@1798fdd
>     at java.lang.Object.wait(Native Method)
>     -  waiting on java.lang.ref.ReferenceQueue$Lock@1798fdd
>     at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
>     at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
>     at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
> Reference Handler
> "Reference Handler" Id=2 Group=system WAITING on 
> java.lang.ref.Reference$Lock@1d40442
>     at java.lang.Object.wait(Native Method)
>     -  waiting on java.lang.ref.Reference$Lock@1d40442
>     at java.lang.Object.wait(Object.java:485)
>     at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
> Signal Dispatcher
> "Signal Dispatcher" Id=4 Group=system RUNNABLE
> Any ideas on how to better recover or prevent this would be greatly 
> appreciated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.jenkins-ci.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to