[ https://issues.jenkins-ci.org/browse/JENKINS-9882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=160958#comment-160958 ]

Uwe Stuehler edited comment on JENKINS-9882 at 3/29/12 3:09 PM:
----------------------------------------------------------------

It seems to me that this bug may not have been fixed completely, or that we are 
seeing a different but similar issue.

After about a day of normal operation, we see HTTP and AJP connections being 
accept()ed, and the thread that accepts them then goes straight back to poll()...

And then _nothing_ happens with the new connection, not a single read() or 
anything. New threads aren't created either. We're at exactly 200 
"RequestHandlerThread" threads in state RUNNABLE.

Is 200 a fixed limit on the number of request handler threads?

*Edit*: added stack trace

All RequestHandlerThread threads have this exact same backtrace:

{noformat}
"RequestHandlerThread[#871]" daemon prio=10 tid=0x00007fee8527a800 nid=0xf2c 
runnable [0x00007fee7d493000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.DataInputStream.readFully(DataInputStream.java:178)
        at java.io.DataInputStream.readFully(DataInputStream.java:152)
        at 
winstone.ajp13.Ajp13IncomingPacket.<init>(Ajp13IncomingPacket.java:60)
        at 
winstone.ajp13.Ajp13Listener.allocateRequestResponse(Ajp13Listener.java:170)
        at winstone.RequestHandlerThread.run(RequestHandlerThread.java:67)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{noformat}
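
Every thread in the dump is parked in a plain blocking socket read: it has been 
handed a connection and is waiting in socketRead0 for an AJP packet header that 
never arrives, which is why the thread still shows as RUNNABLE. Below is a 
minimal, self-contained sketch of that situation (illustration only, not 
Winstone code; the class name and port handling are invented): a client 
connects but never writes, and without a read timeout the handler thread stays 
blocked in the native read indefinitely.

{noformat}
import java.io.DataInputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Illustration only (not Winstone code): shows how a handler thread ends up
// parked in SocketInputStream.socketRead0 when a client connects but never
// sends the expected packet, and why the thread still reports state RUNNABLE.
public class BlockedReadDemo {
    public static void main(String[] args) throws Exception {
        final ServerSocket server = new ServerSocket(0);   // any free port

        Thread handler = new Thread(new Runnable() {
            public void run() {
                try {
                    Socket conn = server.accept();
                    // conn.setSoTimeout(30000); // a read timeout would turn the hang
                    //                           // below into a SocketTimeoutException
                    byte[] header = new byte[4]; // e.g. the start of an AJP13 packet
                    new DataInputStream(conn.getInputStream()).readFully(header);
                    System.out.println("got a packet header");
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }, "RequestHandlerThread-demo");
        handler.start();

        // The "client" connects but never writes anything, so the handler thread
        // blocks forever inside the native socket read.
        Socket client = new Socket("localhost", server.getLocalPort());
        handler.join(5000);
        System.out.println("handler state after 5s: " + handler.getState()); // RUNNABLE
        client.close();                                    // unblocks the read with an EOF
        server.close();
    }
}
{noformat}

Enabling a read timeout on the accepted socket (the commented-out setSoTimeout 
call) would turn the indefinite block into a SocketTimeoutException and let the 
handler return to the pool; without one, each such connection pins a handler 
thread and a file descriptor.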
                
> Jenkins runs out of file descriptors (winstone problem)
> -------------------------------------------------------
>
>                 Key: JENKINS-9882
>                 URL: https://issues.jenkins-ci.org/browse/JENKINS-9882
>             Project: Jenkins
>          Issue Type: Bug
>          Components: core
>    Affects Versions: current
>         Environment: Debian 5, sun-java-jdk (1.6.0_22)
> Jenkins version: 1.414-SNAPSHOT
>            Reporter: Santeri Paavolainen
>
> Running Jenkins with the embedded Winstone server for a long time
> under constant load conditions causes file descriptor and thread
> leakage.
> Environment: Debian 5, sun-java-jdk (1.6.0_22)
> Jenkins version: 1.414-SNAPSHOT
> What happens:
> After running for about 1 day, the following appears in the Jenkins log file:
> [Winstone 2011/05/27 07:35:03] - WARNING: Request handler pool limit exceeded - waiting for retry
> and a bit later (this starts repeating):
> [Winstone 2011/05/27 07:43:25] - WARNING: Request handler pool limit exceeded - waiting for retry
> [Winstone 2011/05/27 07:43:26] - ERROR: Request ignored because there were no more request handlers available in the pool
> [Winstone 2011/05/27 07:43:36] - WARNING: Request handler pool limit exceeded - waiting for retry
> [Winstone 2011/05/27 07:43:37] - ERROR: Request ignored because there were no more request handlers available in the pool
> Jenkins then stops handling requests successfully - intermittently at
> first, but eventually failing almost all requests.
> Using VisualVM I can see that there are a thousand RequestHandlerThread
> threads in wait state, and that over 1200 file descriptors are
> currently in use.
> I think the requests start failing because winstone has this limit:
>     private int MAX_REQUEST_HANDLERS_IN_POOL = 1000;
> as it doesn't seem to be running out of available fds (apparently 8192
> is the maximum in this setup).
> When I restart jenkins I can verify a slow buildup of threads and used
> file descriptors:
> * 10 minutes after restart: 136 live threads, 256 fds used
> * 20 minutes: 150 threads, 271 fds
> * 30 minutes: 161 threads, 280 fds
> * 110 minutes: 255 threads, 376 fds
> I've looked at the repository version of winstone, and there appears
> to be a race condition in the handling of the request handler pool.
> When a request is received by ObjectPool.handleRequest, it looks for
> an available request handler from unusedRequestHandlerThreads and
> calls commenceRequestHandling on the available thread.
> commenceRequestHandling in turn does this.notifyAll() to wake up the
> thread. So far so good. However, when the thread has finished
> processing the request, it calls
> this.objectPool.releaseRequestHandler(this) and *then* waits. I think
> there's a race condition here, since the object pool caller (CALL) and
> the request handler thread (RH) can interleave like this:
> # RH (in RequestHandler.run): this.objectPool.releaseRequestHandler(this)
> # RH (in ObjectPool.releaseRequestHandler): this.unusedRequestHandlerThreads.add(rh)
> # CALL (in ObjectPool.handleRequest): take RH from unusedRequestHandlerThreads
> # CALL (in ObjectPool.handleRequest): rh.commenceRequestHandling(socket, listener);
> # CALL (in RequestHandler.commenceRequestHandling): this.notifyAll()
> # RH (in ObjectPool.run): this.wait()
> Since notify is lost (no waiters), this.wait() in the last step will
> hang forever. This will leak a file descriptor since the socket given
> to be processed is never reclaimed, and threads are effectively lost
> as Winstone will then create more RequestHandlers.
> Now, this is of course a winstone problem, but its development seems
> to be d-e-a-d, at least judging by its bug tracker. As long as this
> problem affects Jenkins, I'd still classify it as a Jenkins problem too.
> I've put this into the winstone tracker too: 
> https://sourceforge.net/tracker/?func=detail&aid=3308285&group_id=98922&atid=622497
> Workaround: Use Tomcat, not embedded winstone (that's what I'm doing now).
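
For illustration, here is a minimal sketch of the lost-notification race 
described in the report above (the class, field and method names are invented 
stand-ins, not Winstone's actual code). The dispatcher's notifyAll() runs in 
the gap between the handler releasing itself into the pool and calling wait(), 
so the wakeup is lost; waiting in a loop on an explicit condition avoids that.

{noformat}
// Illustration only: the class, field and method names below are simplified
// stand-ins for the ObjectPool / RequestHandler interaction described above.
public class LostNotifyDemo {

    static class Handler {
        private Object pendingSocket;   // work handed over by the dispatcher thread

        // Dispatcher side (the CALL steps): hand over the socket and wake the handler.
        synchronized void commenceRequestHandling(Object socket) {
            this.pendingSocket = socket;
            notifyAll();                // lost if the handler has not reached wait() yet
        }

        // Handler side as described in the report: the thread has already put itself
        // back into the unused pool and only *then* waits, so a notifyAll() that fired
        // in between is never seen and this wait() can block forever.
        synchronized void awaitNextRequestBuggy() throws InterruptedException {
            wait();
        }

        // Guarded version: wait in a loop on an explicit condition, so a wakeup
        // that arrived before wait() cannot be lost.
        synchronized Object awaitNextRequestFixed() throws InterruptedException {
            while (pendingSocket == null) {
                wait();
            }
            Object socket = pendingSocket;
            pendingSocket = null;
            return socket;
        }
    }

    public static void main(String[] args) throws Exception {
        Handler rh = new Handler();
        // The interleaving from steps 1-6 above: the dispatcher's notifyAll()
        // runs before the handler thread reaches wait().
        rh.commenceRequestHandling("socket-1");  // CALL: notifyAll() with no waiter -> lost
        // rh.awaitNextRequestBuggy();           // RH: would now hang forever
        System.out.println("handed over: " + rh.awaitNextRequestFixed()); // returns at once
    }
}
{noformat}

With the guarded wait, a hand-over that happens before the handler reaches 
wait() is still picked up, because the condition is checked before blocking.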
