Hi Sverre Moe, I am the person who talked to you this morning :-)
The long-term solution is to stop building on the master. That avoids the performance issues and the need to raise the limits on processes and open files on the machine where the Jenkins master runs. Building on the master is also not recommended from a security point of view.

The short-term solution would be to raise the limit on the number of processes on this machine and to take thread dumps from the master every 10 minutes. For this, you can create a freestyle job with a cron trigger that runs jstack <JENKINS_PID> every 10 minutes. When the issue happens, you can look at the latest 10 builds with their thread dumps and try to figure out what is actually consuming so many threads on the master.

I hope this helps,

On Wednesday, 14 August 2019, 15:38:17 (UTC+2), Devin Nusbaum wrote:
>
> I have not read the whole thread in detail, but the “Unable to create new
> native thread” OutOfMemoryErrors from your original thread, where one of the
> stack traces involves
> org.jenkinsci.plugins.ssegateway.sse.EventDispatcher.scheduleRetryQueueProcessing,
> look like they could be related to
> https://issues.jenkins-ci.org/browse/JENKINS-58684, which is a thread
> leak caused by the SSE Gateway Plugin. You could try reverting the SSE
> Gateway Plugin to version 1.17 to see if that helps, although that might
> reintroduce a different, somewhat rarer memory leak
> (https://issues.jenkins-ci.org/browse/JENKINS-51057). To test my
> hypothesis, if you are running SSE Gateway Plugin version 1.19, you can
> collect thread dumps over time and see if you have a large number
> of threads named “EventDispatcher.retryProcessor” (unfortunately, in version
> 1.18 and below the threads are automatically named “Timer #n”, which is
> less useful). That would confirm that you are hitting JENKINS-58684.
>
> The advice to stop building on master is definitely a good idea as well.
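To make the collected thread dumps easier to compare, here is a minimal sketch in Python that counts threads per name in one jstack dump. It assumes the usual jstack layout, where each thread's header line starts with the thread name in double quotes. The function names count_thread_groups and report are made up for this illustration and are not part of Jenkins or the JDK.

```python
import re
from collections import Counter

# Assumption: each thread header line in the dump starts with the quoted
# thread name, followed by whitespace and further details.
THREAD_HEADER = re.compile(r'^"(?P<name>.*)"\s')


def count_thread_groups(dump_text):
    """Count threads per name, collapsing a trailing number into 'n'."""
    counts = Counter()
    for line in dump_text.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            # "Timer #12" and "Timer #13" both end up in the group "Timer #n".
            group = re.sub(r'\d+$', 'n', match.group('name'))
            counts[group] += 1
    return counts


def report(dump_text, top=10):
    """Return the largest thread groups as (name, count) pairs."""
    return count_thread_groups(dump_text).most_common(top)
```

Running report() on the text of each saved dump and comparing the results over time should show whether one group, for example “EventDispatcher.retryProcessor”, keeps growing.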
>
> On Aug 14, 2019, at 07:11, Sverre Moe <sver...@gmail.com> wrote:
>
> We got a free 30-minute CloudBees support session. It was too short to dig
> deeper to find the problem, but the person I was talking to (after examining
> our logs) mentioned what he thought was the problem and gave a suggestion.
>
> We should not use the Jenkins master at all for builds (allocated with the
> node("master") step). We had 15 executors for the Jenkins master.
>
> We could also try to increase the hard nofile and nproc limits for the
> jenkins user, but the main recommendation was to remove all executors from
> the Jenkins master.
>
> /etc/security/limits.conf
> jenkins soft core unlimited
> jenkins hard core unlimited
> jenkins soft fsize unlimited
> jenkins hard fsize unlimited
> jenkins soft nofile 4096
> jenkins hard nofile 10240 #Was 8192
> jenkins soft nproc 30654
> jenkins hard nproc 60654 #Was 30654
>
> Removing the Jenkins master executors will take some time. We use the
> Jenkins master when we publish our build artifact RPMs to our NFS file
> storage. Since our RPM NFS is only attached to the Jenkins master, it is not
> possible at the moment, unless we build on another agent and then SCP the
> RPM artifacts onto our Jenkins master.
>
> We had a few other circumstances where we used the Jenkins master, like
> checking out a file to determine which build agent to actually use. These I
> have already changed to use any available build agent instead.
>
> On Tuesday, 6 August 2019 at 09:48:50 UTC+2, Sverre Moe wrote:
>>
>> Sadly I was mistaken. We do not use NFS for JENKINS_HOME.
>>
>> We do however use NFS for the location where builds copy the RPM build
>> artifacts.
>>
>> On Monday, 5 August 2019 at 22:17:46 UTC+2, Ivan Fernandez Calvo wrote:
>>>
>>> Hi,
>>>
>>> Sverre has another email thread open, I think it is the same Jenkins
>>> instance
>>> https://groups.google.com/d/msgid/jenkinsci-users/cc2d0bdb-b15f-4bec-a0a3-0562ea8c7df7%40googlegroups.com?utm_medium=email&utm_source=footer.
>>>
>>> I dunno what happens on your instance but probably it isn’t better that you
>>> open another email thread with the description of your issue
>>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to jenkins...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/jenkinsci-users/3e728790-b2f5-4ae1-a9fe-512a5c912d61%40googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/21a93499-136d-4f69-958e-5160e38d71e1%40googlegroups.com.