I had some success getting the Manage Old Data screen to work.  Most of the 
time it throws a ConcurrentModificationException, but occasionally it will 
list a few hundred records along with the "Discard Old Data" button.  When I 
press the button it will, again, sometimes "work" and sometimes throw a CME, 
but in either case it does seem to delete some of the old data.  I repeated 
this process about every hour a couple of days ago and managed to delete 
enough old data that Jenkins continued to run for more than a day.  The best 
chance of this working is when no build jobs are running.

The alternative is to manually delete the disk-usage XML elements from the 
build.xml files in each job's build directories.  I did this for about 200 
files before I got tired of it.  A Groovy script could probably be written 
to do this.
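
For what it's worth, a rough and untested sketch of such a script might look 
like the following.  It assumes the usual JENKINS_HOME/jobs/<job>/builds/<n>/build.xml 
layout, it guesses that the stale entries are elements whose tag name contains 
"BuildDiskUsageAction" (based on the CannotResolveClassException above), and 
the jenkinsHome path is just a placeholder.  Back up the jobs directory and 
make sure Jenkins is stopped (or at least idle) before trying anything like 
this:

import groovy.xml.XmlUtil

// Placeholder - point this at your real JENKINS_HOME.
def jenkinsHome = new File('C:/Jenkins')

new File(jenkinsHome, 'jobs').eachDir { job ->
    def buildsDir = new File(job, 'builds')
    if (!buildsDir.directory) return
    buildsDir.eachDir { buildDir ->
        def buildXml = new File(buildDir, 'build.xml')
        if (!buildXml.file) return
        def root = new XmlParser().parse(buildXml)
        // Find any element whose name mentions the missing plugin class.
        def stale = root.depthFirst().findAll {
            it.name().toString().contains('BuildDiskUsageAction')
        }
        if (stale) {
            stale.each { node -> node.parent().remove(node) }
            buildXml.text = XmlUtil.serialize(root)
            println "cleaned ${buildXml}"
        }
    }
}

I'd try it on a copy of a single job's builds directory first and make sure 
Jenkins can still load those builds afterwards.  (Note it doesn't descend 
into a Maven job's modules/*/builds directories, which would need the same 
treatment.)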

-tim

On Thursday, December 12, 2013 5:26:42 AM UTC-5, nigelm wrote:
>
> So this is what is happening for us :
>
> - The disk-usage plugin was displaying the problems described at the 
> beginning of the thread, so we disabled it.
> - Now, every build that we do, and every sub-project, fills up the 'Old 
> data' log with hundreds of 
> CannotResolveClassException: hudson.plugins.disk_usage.BuildDiskUsageAction
>
> entries, even though that plugin is not used in that build and does not 
> exist any more.
>
> After a modest number of builds (say, 1/2 a day or so), Jenkins bombs with 
> OOM as this log is filled with *millions* of entries, and it's game over.
>
> Is there a way to disable this functionality? I can't see the utility of 
> it, and it's making the system totally unusable.
>
>
>
> On Wed, Dec 11, 2013 at 5:55 PM, Nigel Magnay <nigel....@gmail.com> wrote:
>
>> I've just cracked out MAT on an OOM dump from our machine, and I can 
>> confirm that it looks like OldDataMonitor is the culprit here, too (750 MB 
>> of retained heap).
>>
>> There are over a million entries in the hashmap...
>>
>> On Mon, Dec 9, 2013 at 4:32 PM, Tim Drury <tdr...@gmail.com> wrote:
>>
>>> I'm doing a heap-dump analysis now and I think I might know what the 
>>> issue was.  The start of this whole problem was the disk-usage plugin 
>>> hanging our attempts to view a job in Jenkins (see 
>>> https://issues.jenkins-ci.org/browse/JENKINS-20876) so we disabled that 
>>> plugin.  After disabling, Jenkins complained about data in an 
>>> older/unreadable format:
>>>
>>> You have data stored in an older format and/or unreadable data.
>>>
>>> If I click the "Manage" button to delete it, it takes a _long_ time for 
>>> it to display all the disk-usage plugin data - there must be thousands of 
>>> rows, but it does display it all eventually.  The error shown in each row 
>>> is:
>>>
>>> CannotResolveClassException: 
>>> hudson.plugins.disk_usage.BuildDiskUsageAction
>>>  
>>> If I click "Discard Unreadable Data" at the bottom of the page, I 
>>> quickly get a stack trace:
>>>
>>> javax.servlet.ServletException: java.util.ConcurrentModificationException
>>>     at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:735)
>>>     at org.kohsuke.stapler.Stapler.invoke(Stapler.java:799)
>>>     at org.kohsuke.stapler.MetaClass$6.doDispatch(MetaClass.java:239)
>>>     at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
>>>     at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:685)
>>>     at org.kohsuke.stapler.Stapler.invoke(Stapler.java:799)
>>>     at org.kohsuke.stapler.Stapler.invoke(Stapler.java:587)
>>>     at org.kohsuke.stapler.Stapler.service(Stapler.java:218)
>>>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:45)
>>>     at winstone.ServletConfiguration.execute(ServletConfiguration.java:248)
>>>     at winstone.RequestDispatcher.forward(RequestDispatcher.java:333)
>>>     at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:376)
>>>     at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:96)
>>>     at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:203)
>>>     at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:181)
>>>     at net.bull.javamelody.PluginMonitoringFilter.doFilter(PluginMonitoringFilter.java:86)
>>>
>>> and it fails to discard the data.  Older data isn't usually a problem, so 
>>> I brushed off this error.  However, here is the dominator_tree of the heap 
>>> dump:
>>>
>>> Class Name                                                                     | Shallow Heap | Retained Heap | Percentage
>>> --------------------------------------------------------------------------------------------------------------------------
>>> hudson.diagnosis.OldDataMonitor @ 0x6f9f2c4a0                                  |           24 | 3,278,466,984 |     88.69%
>>> com.thoughtworks.xstream.converters.SingleValueConverterWrapper @ 0x6f9da8780  |           16 |    13,825,616 |      0.37%
>>> hudson.model.Hudson @ 0x6f9b8b8e8                                              |          272 |     3,572,400 |      0.10%
>>> org.eclipse.jetty.webapp.WebAppClassLoader @ 0x6f9a73598                       |           88 |     2,308,760 |      0.06%
>>> org.apache.commons.jexl.util.introspection.Introspector @ 0x6fbb74710          |           32 |     1,842,392 |      0.05%
>>> org.kohsuke.stapler.WebApp @ 0x6f9c0ff10                                       |           64 |     1,127,480 |      0.03%
>>> java.lang.Thread @ 0x7d5c2d138 Handling GET /view/Alle/job/common-translation-main/ : RequestHandlerThread[#105] Thread | 112 | 971,336 | 0.03%
>>> --------------------------------------------------------------------------------------------------------------------------
>>>
>>> What is hudson.diagnosis.OldDataMonitor?  Could the disk-usage plugin 
>>> data be the cause of all my recent OOM errors?  If so, how do I get rid of 
>>> it?
>>>
>>> -tim
>>>
>>>
>>> On Monday, December 9, 2013 9:41:25 AM UTC-5, Tim Drury wrote:
>>>>
>>>> I intended to install 1.532 on Friday, but mistakenly installed 1.539. 
>>>>  It gave us the same OOM exceptions.  I'm installing 1.532 now and will - 
>>>> hopefully - know tomorrow whether it's stable or not.  I'm not exactly 
>>>> sure 
>>>> what's going to happen with our plugins though.  Hopefully Jenkins will 
>>>> tell me if they must be downgraded too.
>>>>
>>>> -tim
>>>>
>>>> On Monday, December 9, 2013 7:45:28 AM UTC-5, Stephen Connolly wrote:
>>>>>
>>>>> How does the current LTS (1.532.1) hold up?
>>>>>
>>>>>
>>>>> On 6 December 2013 13:33, Tim Drury <tdr...@gmail.com> wrote:
>>>>>
>>>>>> We updated Jenkins to 1.542 two days ago (from 1.514) and we're 
>>>>>> getting a lot of OOM errors.  (Info: Windows Server 2008 R2, Jenkins 
>>>>>> JVM is jdk-x64-1.6.0_26.)
>>>>>>
>>>>>> At first I did the simplest thing and increased the heap from 3G to 
>>>>>> 4.2G (and bumped up permgen).  This didn't help so I started looking at 
>>>>>> threads via the Jenkins monitoring tool.  It indicated the disk-usage 
>>>>>> plugin was hung.  When you tried to view a page for a particularly large 
>>>>>> job, the page would "hang" and the stack trace showed the disk-usage 
>>>>>> plugin was to blame (or so I thought).  Jira report with thread dump 
>>>>>> here: https://issues.jenkins-ci.org/browse/JENKINS-20876
>>>>>>
>>>>>> We disabled the disk-usage plugin and restarted and now we can visit 
>>>>>> that job page.  However, we still get OOM and lots of GCs in the logs at 
>>>>>> least once a day.  The stack trace looks frighteningly similar to the 
>>>>>> one from the disk-usage plugin.  Here is an edited stack trace showing 
>>>>>> the methods common to the two OOM incidents: one while the disk-usage 
>>>>>> plugin was enabled and one after it was disabled:
>>>>>>
>>>>>> [lots of xstream methods snipped]
>>>>>> hudson.XmlFile.unmarshal(XmlFile.java:165)
>>>>>> hudson.model.Run.reload(Run.java:323)
>>>>>> hudson.model.Run.<init>(Run.java:312)
>>>>>> hudson.model.AbstractBuild.<init>(AbstractBuild.java:185)
>>>>>> hudson.maven.AbstractMavenBuild.<init>(AbstractMavenBuild.java:54)
>>>>>> hudson.maven.MavenModuleSetBuild.<init>(MavenModuleSetBuild.java:146)
>>>>>> ... [JVM methods snipped]
>>>>>> hudson.model.AbstractProject.loadBuild(AbstractProject.java:1155)
>>>>>> hudson.model.AbstractProject$1.create(AbstractProject.java:342)
>>>>>> hudson.model.AbstractProject$1.create(AbstractProject.java:340)
>>>>>> hudson.model.RunMap.retrieve(RunMap.java:225)
>>>>>> hudson.model.RunMap.retrieve(RunMap.java:59)
>>>>>> jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:677)
>>>>>> jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:660)
>>>>>> jenkins.model.lazy.AbstractLazyLoadRunMap.search(AbstractLazyLoadRunMap.java:502)
>>>>>> jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:536)
>>>>>> hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:1077)
>>>>>> hudson.maven.MavenBuild.getParentBuild(MavenBuild.java:165)
>>>>>> hudson.maven.MavenBuild.getWhyKeepLog(MavenBuild.java:273)
>>>>>> hudson.model.Run.isKeepLog(Run.java:572)
>>>>>> ...
>>>>>>
>>>>>> It seems something in "core" Jenkins has changed, and not for the 
>>>>>> better.  Is anyone else seeing these issues?
>>>>>>
>>>>>> -tim
