Wow, I can't believe I didn't know about /threadDump before, thanks! Anyways, I mostly see lots (on the order of 100s) of:
"Computer.threadPoolForRemoting [#85811]" Id=380625 Group=main TIMED_WAITING on java.util.concurrent.SynchronousQueue$TransferStack@687826ae at sun.misc.Unsafe.park(Native Method) - waiting on java.util.concurrent.SynchronousQueue$TransferStack@687826ae at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362) at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) I did see some (~20) CPS executors that looked suspicious (they're also hanging forever when i go to the job): "Running CpsFlowExecution[Owner[workflow_test/619:workflow_test#619]]" Id=7310 Group=main RUNNABLE at org.kohsuke.stapler.framework.io.LargeText$BufferSession.skip(LargeText.java:532) at org.kohsuke.stapler.framework.io.LargeText.writeLogTo(LargeText.java:211) at hudson.console.AnnotatedLargeText.writeRawLogTo(AnnotatedLargeText.java:162) at org.jenkinsci.plugins.workflow.job.WorkflowRun.copyLogs(WorkflowRun.java:359) at org.jenkinsci.plugins.workflow.job.WorkflowRun.access$600(WorkflowRun.java:107) at org.jenkinsci.plugins.workflow.job.WorkflowRun$GraphL.onNewHead(WorkflowRun.java:752) - locked java.util.concurrent.atomic.AtomicBoolean@7eb2df49 at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.notifyListeners(CpsFlowExecution.java:799) at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$4.run(CpsThreadGroup.java:320) at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.run(CpsVmExecutorService.java:32) at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112) at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Number of locked synchronizers = 1 - java.util.concurrent.ThreadPoolExecutor$Worker@d3b1e0f I noticed them before but I assumed they weren't related to this, issue, I think they're waiting for slaves nodes that have been removed from our jenkins (not just disconnected but deleted), I'll clean these up with the /kill url on workflows and see if that improves things since the threaddump makes me wonder if these are related. Thanks again On Thursday, January 14, 2016 at 4:14:16 PM UTC-5, Christopher Orr wrote: > > On 14/01/16 21:30, Kevin wrote: > > Hi all, I've got some custom cloud providers that don't seem to get > > triggered, I've traced it back to NodeProvisionerInvoker not being > > called (even after 30 minutes of waiting), I can call the > > nodeProvisioners by hand (using the update function) in the same way > > that the NodeProvisionerInvoker does in the groovy script console and > > things seem to work. 
Thanks again

On Thursday, January 14, 2016 at 4:14:16 PM UTC-5, Christopher Orr wrote:
>
> On 14/01/16 21:30, Kevin wrote:
> > Hi all, I've got some custom cloud providers that don't seem to get
> > triggered. I've traced it back to NodeProvisionerInvoker not being
> > called (even after 30 minutes of waiting). I can call the
> > nodeProvisioners by hand (using the update function) in the same way
> > that the NodeProvisionerInvoker does in the groovy script console and
> > things seem to work. Actually it seems like a number of bits related to
> > PeriodicWork classes aren't working, since I assume git polling and the
> > periodic build setting are PeriodicWork things, but they're also not
> > triggering.
> >
> > I did also look at PeriodicWork.all().collect {it.class.name}
> > and println jenkins.util.Timer.get() and got:
> >
> > [hudson.diagnosis.HudsonHomeDiskUsageChecker,
> > hudson.diagnosis.MemoryUsageMonitor,
> > hudson.model.FingerprintCleanupThread,
> > hudson.model.LoadStatistics$LoadStatisticsUpdater,
> > hudson.model.WorkspaceCleanupThread,
> > hudson.slaves.ComputerRetentionWork,
> > hudson.slaves.ConnectionActivityMonitor,
> > hudson.slaves.NodeProvisioner$NodeProvisionerInvoker,
> > hudson.triggers.Trigger$Cron,
> > jenkins.model.DownloadSettings$DailyCheck,
> > org.jenkinsci.plugins.periodicreincarnation.PeriodicReincarnation]
> >
> > and
> >
> > jenkins.util.ErrorLoggingScheduledThreadPoolExecutor@6d72bb0e[Running,
> > pool size = 10, active threads = 10, queued tasks = 2521, completed
> > tasks = 1134830]
>
> If the number of active threads remains equal to the pool size, then I
> would guess that some tasks are getting stuck.
>
> Can you see a bunch of timer-related or otherwise suspicious-looking
> tasks running if you go to /threadDump on your Jenkins instance?
>
> Regards,
> Chris
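In case it's useful to anyone else hitting this: "calling the nodeProvisioners by hand" in my original message means roughly the following in the script console. It's just a sketch of what NodeProvisioner$NodeProvisionerInvoker does on each cycle; note that update() isn't public API (it's private, Groovy simply doesn't enforce that in the console), so treat it as a diagnostic workaround rather than something supported.

    import jenkins.model.Jenkins

    // Roughly what NodeProvisioner.NodeProvisionerInvoker does on each run:
    // kick the provisioner for unlabeled load, then one per label.
    // update() is private; this relies on Groovy not enforcing the modifier.
    Jenkins.instance.unlabeledNodeProvisioner.update()
    Jenkins.instance.labels.each { label ->
        label.nodeProvisioner.update()
    }

That only pokes the provisioners once, of course; it doesn't fix whatever is starving the timer pool that normally runs the invoker.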