We updated to 1.624 last Saturday. Since then, we have been running into
problems where the web interface suddenly gets slow, then eventually stops
responding, though jobs are still processing.

After the second incident I installed the JavaMelody plugin. It shows a sudden
spike in memory from ~2 GB to near the heap max on the master.

In the memory histogram view, it shows
de.esailors.jenkins.teststability.StabilityTestData$Result taking 5 GB (74%)
of the heap, with 216M instances. java.lang.Object[] is taking 1.2 GB, with
325K instances.
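
To cross-check the JavaMelody histogram from the command line, something like the rough Python sketch below could rank classes from the saved output of `jmap -histo <pid>`. It assumes the usual row layout of `num: #instances #bytes class name`; the function name `top_classes` is made up for illustration, and the share it reports is relative to the classes listed in the histogram, not to the configured heap max.

```python
import re

# One data row of `jmap -histo` output is assumed to look like:
#    1:     <instances>     <bytes>  some.package.ClassName
# Header, separator and "Total" lines do not match and are skipped.
ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")


def top_classes(histo_text, limit=5):
    """Return (class_name, instances, bytes, share) for the largest classes.

    `share` is the class's bytes divided by the sum of bytes over all
    parsed rows, i.e. relative to the histogram itself.
    """
    rows = []
    for line in histo_text.splitlines():
        m = ROW.match(line)
        if m:
            instances, size, name = int(m.group(1)), int(m.group(2)), m.group(3)
            rows.append((name, instances, size))
    total = sum(size for _, _, size in rows)
    # Largest consumers first.
    rows.sort(key=lambda r: r[2], reverse=True)
    return [(n, i, s, s / total if total else 0.0) for n, i, s in rows[:limit]]
```

Usage would be along the lines of `top_classes(open("histo.txt").read())` after redirecting the `jmap -histo` output to `histo.txt` (a hypothetical file name).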

It also shows outstanding requests for

$stapler/bound/a3eefa4e-d1b7-4112-a60d-f58bd64f4bb1/rerunBuild ajax POST
<http://wd-jenkins-master.swg.usma.ibm.com:8072/monitoring?part=graph&graph=httpae096f0191a18d40f1d8a163693cbe2515af9b76>
Handling POST /$stapler/bound/f3fef2df-4d0f-4207-af7d-d296c4659d28/rerunBuild from [x.x.x.x]: RequestHandlerThread[#10]

that have been running for a very long time. This correlates with someone
clicking the re-run button in a build pipeline view.
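
To see what those long-running request threads are actually doing, one option would be to take a thread dump (`jstack <pid>`) and filter it by thread name. The following is only a rough sketch: it assumes the standard HotSpot dump layout, where each thread entry starts with a quoted name and entries are separated by blank lines, and `threads_matching` is a hypothetical helper name.

```python
def threads_matching(dump_text, needle):
    """Return the dump blocks (name line plus stack) of threads whose
    quoted name contains `needle`."""
    matches = []
    # Thread entries are assumed to be separated by blank lines.
    for block in dump_text.split("\n\n"):
        block = block.strip()
        if not block.startswith('"'):
            continue  # skip the dump header and other non-thread sections
        end = block.find('"', 1)
        name = block[1:end] if end != -1 else ""
        if needle in name:
            matches.append(block)
    return matches
```

Calling `threads_matching(dump, "rerunBuild")` on a dump taken while the UI is hung should return the stacks of the request handler threads stuck in those POSTs, if the thread names look like the ones above.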

There's nothing obvious in the logs around the time of each incident, but 
each does have something like

WARNING: Failed to load [path to jenkins home/somejob]/builds/576/junitResult.xml
java.io.FileNotFoundException: [path to jenkins home/somejob]builds/576/junitResult.xml (No such file or directory)

If I look in the filesystem, that file is there.

There are also some of these, which I thought might be due to problems in
getPreviousResult() brought on by having to kill the server and restart it
while jobs were in flight, because the UI was hung.

Aug 20, 2015 1:48:21 PM org.eclipse.jetty.util.log.JavaUtilLog warn
WARNING: Error while serving http://[server name]/view/[view name]/job/[job name]/jacoco/graph
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[ big stack trace that ends with ]
Caused by: java.lang.NullPointerException
        at hudson.model.AbstractBuild.getPreviousBuild(AbstractBuild.java:199)
        at hudson.model.AbstractBuild.getPreviousBuild(AbstractBuild.java:107)
        at hudson.plugins.jacoco.JacocoBuildAction.getPreviousResult(JacocoBuildAction.java:249)
        at hudson.plugins.jacoco.JacocoBuildAction.getPreviousResult(JacocoBuildAction.java:240)
        at hudson.plugins.jacoco.JacocoBuildAction.getPreviousResult(JacocoBuildAction.java:36)
        at hudson.plugins.jacoco.model.CoverageObject$1.createDataSet(CoverageObject.java:379)
        at hudson.plugins.jacoco.model.CoverageObject$GraphImpl.createGraph(CoverageObject.java:418)
        at hudson.util.Graph.render(Graph.java:87)
        at hudson.util.Graph.doPng(Graph.java:98)
        at hudson.plugins.jacoco.model.CoverageObject.doGraph(CoverageObject.java:373)
        at hudson.plugins.jacoco.JacocoProjectAction.doGraph(JacocoProjectAction.java:53)

and similarly

Aug 21, 2015 10:30:03 AM hudson.ExpressionFactory2$JexlExpression evaluate
...
Caused by: java.lang.IllegalStateException: hudson.tasks.junit.TestResultAction@1b176aa2 was attached to both [jobname] #574 and [jobname] #576
        at hudson.tasks.test.AbstractTestResultAction.getPreviousResult(AbstractTestResultAction.java:229)
WARNING: Caught exception evaluating: it.failureDiffString in /view/[view name]/job/[job name]. Reason: java.lang.reflect.InvocationTargetException

We upgraded to 1.625 and increased the heap, but we are still hitting this.

We have 224 jobs and 50 workers. Ideas welcome.
I can try disabling the test stability plugin, but it's very useful.
