You can attack this one of two ways: by playing with how Java does GC, or by 
reducing how much G you have to C.  I’ll let someone smarter than I guide you 
on how to keep GC from stopping the world, but I think that the basic problem 
is that you’ve got a 100 GB heap.

How many old job runs are you keeping?  Jenkins keeps data (modulo build logs 
and artifacts) on every run it remembers, and it keeps it in memory.  In our 
shop, we only keep build results on Jenkins for a week or so, and part of the 
build process is to persist the results to an RDBMS for long-term history.  If 
your jobs aren’t set to “Discard Old Builds”, that’s your problem.  With an 
installation that size, you can’t afford to keep records going back forever in 
Jenkins itself.

If the problem is too many old builds, you’ll need to set your jobs to discard 
old builds, and then probably restart Jenkins (it won’t immediately “forget” 
old runs).  Somebody around here should remember some Groovy magic that will 
let you “forget” old builds after you set the discard rule, without restarting 
the server.

If your system is so complex that you need a 100 GB heap and you need better 
control over the JVM, remember (if you’re not doing it already) that you can 
run Jenkins standalone, as its own server, so you’re not beholden to Tomcat, 
JBoss, or whatever app server you’re running.  If it matters, my installation 
has 100+ nodes, 200+ jobs, and 1000-2000 build runs recorded at any given time, 
and we run it as its own server in 4 GB with no problem.

--Rob

From: jenkinsci-users@googlegroups.com 
[mailto:jenkinsci-users@googlegroups.com] On Behalf Of icarusnine
Sent: Tuesday, October 09, 2012 5:53 AM
To: jenkinsci-users@googlegroups.com
Subject: Performance problems on Jenkins master(very long minor gc with 
stop-the-world)


Hello.

We have a very large Jenkins setup that consists of one master node with 100+ 
slaves and 1000+ jobs.
We have reasons for keeping just a single master node, so it isn't possible to 
split our Hudson master.

Now we are experiencing performance problems: minor GC happens frequently, 
each one takes 1~2 minutes, and it stops the world.
However, full GC completes within 20~30 seconds.
Our heap size is over 100G, so it is hard to generate and analyze a heap dump.

Does anyone have any experience with very large Hudson installations like this?
Is there any advice for tuning or recommendations for this issue?


Also, please do let me know if there is any other data that I can provide that 
would help with analysis.
Thanks for any help you can provide.

----------------------------------------------------------------
Jenkins info
----------------------------------------------------------------
Core ver : 1.424.6
WAS : weblogic 10.3.2
JAVA : jdk1.6.0.34
JVM OPTION :
-Xms180g -Xmx180g -XX:NewSize=140g -XX:MaxNewSize=140g -XX:PermSize=1024m 
-XX:MaxPermSize=1024m
-XX:-UseGCOverheadLimit -XX:+UseParallelGC -XX:SurvivorRatio=8 -verbosegc 
-Xloggc:app_gc.log -XX:+PrintGCDetails
-XX:+PrintGCTimeStamps -Djava.awt.headless=true

----------------------------------------------------------------
Server spec
----------------------------------------------------------------
CPU : Intel(R) Xeon(R) CPU E5-2690 2.90GHz * 4 (32 core)
RAM : 256GB

----------------------------------------------------------------
Gc log
----------------------------------------------------------------
60286.042: [GC [PSYoungGen: 8655292K->8714K(40587584K)] 
25909185K->17262608K(208359744K), 0.0248360 secs] [Times: user=0.38 sys=0.01, 
real=0.03 secs]
60286.067: [Full GC (System) [PSYoungGen: 8714K->0K(40587584K)] [ParOldGen: 
17253893K->17228395K(167772160K)] 17262608K->17228395K(208359744K) [PSPermGen: 
194622K->194622K(2097152K)], 1.8638320 secs] [Times: user=33.26 sys=0.23, 
real=1.86 secs]
60748.860: [GC [PSYoungGen: 39173056K->532623K(40528512K)] 
56401451K->17761019K(208300672K), 0.0837520 secs] [Times: user=1.19 sys=0.00, 
real=0.08 secs]
61243.483: [GC [PSYoungGen: 39705679K->29759K(40658432K)] 
56934075K->17272524K(208430592K), 0.0558890 secs] [Times: user=0.49 sys=0.00, 
real=0.05 secs]
61805.663: [GC [PSYoungGen: 39346943K->28331K(40601792K)] 
56589708K->17275705K(208373952K), 0.0544110 secs] [Times: user=0.49 sys=0.01, 
real=0.06 secs]
62383.664: [GC [PSYoungGen: 39345515K->33640K(40776640K)] 
56592889K->17284373K(208548800K), 0.0592330 secs] [Times: user=0.49 sys=0.00, 
real=0.06 secs]
..........................
85842.953: [GC [PSYoungGen: 38973565K->1818337K(40038592K)] 
80709421K->44276973K(207810752K), 22.0442750 secs] [Times: user=2.44 
sys=503.41, real=22.04 secs]
85976.095: [GC [PSYoungGen: 40038561K->1904445K(37126592K)] 
82497204K->46320890K(204898752K), 49.0663710 secs] [Times: user=2.88 
sys=1117.05, real=49.06 secs]
86147.499: [GC [PSYoungGen: 37126456K->1721075K(38582592K)] 
81542901K->48037517K(206354752K), 39.6267960 secs] [Times: user=2.81 
sys=904.88, real=39.62 secs]
86265.898: [GC [PSYoungGen: 36943219K->1147657K(38796608K)] 
83259661K->49166685K(206568768K), 43.2677960 secs] [Times: user=6.13 
sys=985.33, real=43.26 secs]
86435.957: [GC [PSYoungGen: 36592456K->748179K(38591488K)] 
84611484K->49915859K(206363648K), 34.1037910 secs] [Times: user=2.48 
sys=780.02, real=34.10 secs]
86560.263: [GC [PSYoungGen: 36192756K->448475K(38982464K)] 
85360436K->50343633K(206754624K), 27.3025220 secs] [Times: user=1.64 
sys=623.52, real=27.29 secs]
86594.685: [GC [PSYoungGen: 36402298K->106372K(38914816K)] 
86297455K->50438940K(206686976K), 15.7548480 secs] [Times: user=1.88 
sys=359.23, real=15.76 secs]
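
For anyone who wants to summarize a log like the one above instead of reading 
it by eye, here is a minimal Java sketch. It is not part of the original post: 
the class name GcPauseSummary and its methods are made up for illustration, and 
the regular expression only assumes the -XX:+PrintGCDetails / 
-XX:+PrintGCTimeStamps layout shown in this log (other collectors and JVM 
versions print different formats). It pulls the user, sys, and real times out 
of each entry and flags collections where kernel (sys) time exceeds user time, 
which is the pattern visible in the slow entries from 85842.953 onward.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Jenkins or the JDK.
final class GcPauseSummary {

    /** One collection entry taken from the log. */
    static final class Event {
        final double uptimeSec; // JVM uptime at which the collection was logged
        final boolean full;     // true for "Full GC", false for a minor "GC"
        final double userSec;   // CPU seconds spent in user mode
        final double sysSec;    // CPU seconds spent in the kernel
        final double realSec;   // wall-clock duration of the pause

        Event(double uptimeSec, boolean full,
              double userSec, double sysSec, double realSec) {
            this.uptimeSec = uptimeSec;
            this.full = full;
            this.userSec = userSec;
            this.sysSec = sysSec;
            this.realSec = realSec;
        }

        /** True when kernel time exceeds user time for this collection. */
        boolean sysDominated() {
            return sysSec > userSec;
        }
    }

    // DOTALL lets ".*?" cross the line breaks a mail client inserts mid-entry.
    private static final Pattern ENTRY = Pattern.compile(
            "(\\d+\\.\\d+): \\[(Full GC|GC)\\b.*?"
            + "\\[Times: user=(\\d+\\.\\d+)\\s+sys=(\\d+\\.\\d+),"
            + "\\s+real=(\\d+\\.\\d+) secs\\]",
            Pattern.DOTALL);

    /** Extracts every entry that matches the assumed log layout. */
    static List<Event> parse(String log) {
        List<Event> events = new ArrayList<>();
        Matcher m = ENTRY.matcher(log);
        while (m.find()) {
            events.add(new Event(
                    Double.parseDouble(m.group(1)),
                    m.group(2).equals("Full GC"),
                    Double.parseDouble(m.group(3)),
                    Double.parseDouble(m.group(4)),
                    Double.parseDouble(m.group(5))));
        }
        return events;
    }

    public static void main(String[] args) {
        // Two entries copied from the log above, including the mail wrapping.
        String sample =
                "60286.042: [GC [PSYoungGen: 8655292K->8714K(40587584K)] \n"
                + "25909185K->17262608K(208359744K), 0.0248360 secs] "
                + "[Times: user=0.38 sys=0.01, \nreal=0.03 secs]\n"
                + "85976.095: [GC [PSYoungGen: 40038561K->1904445K(37126592K)] \n"
                + "82497204K->46320890K(204898752K), 49.0663710 secs] "
                + "[Times: user=2.88 \nsys=1117.05, real=49.06 secs]\n";
        for (Event e : parse(sample)) {
            System.out.printf("%.3f %s real=%.2fs user=%.2fs sys=%.2fs%s%n",
                    e.uptimeSec, e.full ? "full" : "minor",
                    e.realSec, e.userSec, e.sysSec,
                    e.sysDominated() ? "  <-- mostly kernel time" : "");
        }
    }
}
```

To run it against a real file, read app_gc.log into a String and pass it to 
parse(). Pauses that are mostly sys time generally mean the time went to the 
operating system rather than to the collector's own work, so that is worth 
checking separately from any heap-size or retention changes.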

