We updated to 1.624 last Saturday. Since then, we've been running into problems where the web interface suddenly gets slow and then eventually stops responding, though jobs are still processing.
I installed the javamelody plugin after the second incident. It shows a sudden spike in memory on the master, from ~2GB to near the heap max. In the memory histogram view, de.esailors.jenkins.teststability.StabilityTestData$Result is taking 5GB (74%) of the heap, with 216M "instances". java.lang.Object[] is taking 1.2GB, with 325K instances.

It also shows outstanding requests that have been running for a very long time, such as:

$stapler/bound/a3eefa4e-d1b7-4112-a60d-f58bd64f4bb1/rerunBuild ajax POST <http://wd-jenkins-master.swg.usma.ibm.com:8072/monitoring?part=graph&graph=httpae096f0191a18d40f1d8a163693cbe2515af9b76> / Handling POST /$stapler/bound/f3fef2df-4d0f-4207-af7d-d296c4659d28/rerunBuild from [x.x.x.x]: RequestHandlerThread[#10]

This correlates with someone clicking the re-run button in a build pipeline view.

There's nothing obvious in the logs around the time of each incident, but each one does have something like:

WARNING: Failed to load [path to jenkins home/somejob]/builds/576/junitResult.xml
java.io.FileNotFoundException: [path to jenkins home/somejob]/builds/576/junitResult.xml (No such file or directory)

If I look in the filesystem, that file is there.

There are also some of the following, which I thought might be due to problems in getPreviousResult() brought on by having to kill and restart the server while the UI was hung and jobs were still in flight.
Aug 20, 2015 1:48:21 PM org.eclipse.jetty.util.log.JavaUtilLog warn
WARNING: Error while serving http://[server name]/view/[view name]/job/[job name]/jacoco/graph
java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	[ big stack trace that ends with ]
Caused by: java.lang.NullPointerException
	at hudson.model.AbstractBuild.getPreviousBuild(AbstractBuild.java:199)
	at hudson.model.AbstractBuild.getPreviousBuild(AbstractBuild.java:107)
	at hudson.plugins.jacoco.JacocoBuildAction.getPreviousResult(JacocoBuildAction.java:249)
	at hudson.plugins.jacoco.JacocoBuildAction.getPreviousResult(JacocoBuildAction.java:240)
	at hudson.plugins.jacoco.JacocoBuildAction.getPreviousResult(JacocoBuildAction.java:36)
	at hudson.plugins.jacoco.model.CoverageObject$1.createDataSet(CoverageObject.java:379)
	at hudson.plugins.jacoco.model.CoverageObject$GraphImpl.createGraph(CoverageObject.java:418)
	at hudson.util.Graph.render(Graph.java:87)
	at hudson.util.Graph.doPng(Graph.java:98)
	at hudson.plugins.jacoco.model.CoverageObject.doGraph(CoverageObject.java:373)
	at hudson.plugins.jacoco.JacocoProjectAction.doGraph(JacocoProjectAction.java:53)

and similarly

Aug 21, 2015 10:30:03 AM hudson.ExpressionFactory2$JexlExpression evaluate
...
Caused by: java.lang.IllegalStateException: hudson.tasks.junit.TestResultAction@1b176aa2 was attached to both [jobname] #574 and [jobname] #576
	at hudson.tasks.test.AbstractTestResultAction.getPreviousResult(AbstractTestResultAction.java:229)
WARNING: Caught exception evaluating: it.failureDiffString in /view/[view name]/job/[job name]. Reason: java.lang.reflect.InvocationTargetException

We upgraded to 1.625 and increased the heap, but we're still hitting this. We have 224 jobs and 50 workers.

Ideas welcome. I can try disabling the test stability plugin, but it's very useful.