On Feb 7, 2014, at 2:38 AM, Mark Thomas <ma...@apache.org> wrote:

> Jesse Barnum <jsb_tom...@360works.com> wrote:
> 
> Thanks for such a well written question. All the relevant information is 
> available and presented clearly and logically.

Glad I could help. I get error reports from my users all the time like "I 
installed the update and now it doesn't work", so I know how frustrating that 
can be :-)

> 
>> Problem summary:
>> My nio polling threads are using too much CPU time.
> 
> Are you sure that is the real problem? It sounds like the occasional slowness 
> is the problem and the polling threads are the suspected - emphasis on the 
> suspected - root cause.

That's true. I ran it with a profiler (YourKit) during peak time, and I think I've 
found the real culprit: reading from the Coyote InputStream is blocking.

Out of 98,515 milliseconds spent in javax.servlet.http.HttpServlet.service:
* 95,801 ms are spent calling 
org.apache.catalina.connector.CoyoteInputStream.read(byte[]). Of this time, 
81,707 ms are spent waiting for KeyAttachment.awaitLatch.
* 2,698 ms are spent on business logic and MySQL calls
* 16 ms are spent closing the input stream and setting HTTP headers

The last Tomcat code I see is 
org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch(CountDownLatch, 
long, TimeUnit). I'm not sure exactly what it does, but it seems to be the holdup.
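
From the stack trace, my best guess (possibly wrong) is that the worker thread parks 
on a CountDownLatch until the poller thread signals that the socket is readable 
again. A minimal sketch of that pattern, purely my own illustration and not the 
actual Tomcat source:

    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.TimeUnit;

    class ReadLatchSketch {
        private final CountDownLatch readLatch = new CountDownLatch(1);

        // Worker thread: parks here (shows up as TIMED_WAITING) until the
        // poller signals that data is ready or the timeout expires.
        boolean awaitReadable(long timeout, TimeUnit unit) throws InterruptedException {
            return readLatch.await(timeout, unit);
        }

        // Poller thread: called once the selector reports the socket is readable.
        void signalReadable() {
            readLatch.countDown();
        }
    }

If that reading is right, the time charged to awaitLatch is really time spent 
waiting for more data to arrive, not CPU work inside Tomcat.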

If you want me to send you the raw profiler information, let me know how to 
send it and I'll be happy to (although I'm running an old version of YourKit, 
9.5.6). I can also send screen shots if that's helpful / easier.

Here is a typical thread dump while the request is being serviced:
> Thread name: ajp-nio-8009-exec-19 / ID: 55 / state: TIMED_WAITING
> Milliseconds spent waiting: 47,475,371 / blocked: 2,389
>     sun.misc.Unsafe.$$YJP$$park(Native Method)
>     sun.misc.Unsafe.park(Unsafe.java)
>     java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>     java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>     java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>     java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>     org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch(NioEndpoint.java:1571)
>     org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitReadLatch(NioEndpoint.java:1573)
>     org.apache.tomcat.util.net.NioBlockingSelector.read(NioBlockingSelector.java:172)
>     org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:246)
>     org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:227)
>     org.apache.coyote.ajp.AjpNioProcessor.readSocket(AjpNioProcessor.java:342)
>     org.apache.coyote.ajp.AjpNioProcessor.read(AjpNioProcessor.java:314)
>     org.apache.coyote.ajp.AjpNioProcessor.readMessage(AjpNioProcessor.java:406)
>     org.apache.coyote.ajp.AjpNioProcessor.receive(AjpNioProcessor.java:375)
>     org.apache.coyote.ajp.AbstractAjpProcessor.refillReadBuffer(AbstractAjpProcessor.java:616)
>     org.apache.coyote.ajp.AbstractAjpProcessor$SocketInputBuffer.doRead(AbstractAjpProcessor.java:1070)
>     org.apache.coyote.Request.doRead(Request.java:422)
>     org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:290)
>     org.apache.tomcat.util.buf.ByteChunk.substract(ByteChunk.java:431)
>     org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:315)
>     org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:167)
>     com.prosc.io.IOUtils.writeInputToOutput(IOUtils.java:49)
>     com.prosc.io.IOUtils.inputStreamAsBytes(IOUtils.java:116)
>     com.prosc.io.IOUtils.inputStreamAsString(IOUtils.java:136)
>     com.prosc.io.IOUtils.inputStreamAsString(IOUtils.java:127)
>     com.prosc.licensecheck.LicenseCheck.doPost(LicenseCheck.java:169)
>     javax.servlet.http.HttpServlet.service(HttpServlet.java:647)
>     javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
>     org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
>     org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>     com.prosc.infrastructure.LogFilter.doFilter(LogFilter.java:22)
>     org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
>     org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>     com.prosc.infrastructure.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:38)
>     org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
>     org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>     org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
>     org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
>     org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
>     org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
>     org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
>     org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
>     org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
>     org.apache.coyote.ajp.AjpNioProcessor.process(AjpNioProcessor.java:184)
>     org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
>     org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1680)
>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     java.lang.Thread.run(Thread.java:722)

So it seems that the worker thread is blocking on something while it's trying 
to read the POST data…? If my guess is correct, is this a problem that can be 
solved by reconfiguring?
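
For context, the read path in my servlet is essentially "drain the request body into 
a string". A simplified illustration of what com.prosc.io.IOUtils does (not the 
exact code, just the shape of it):

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.charset.StandardCharsets;

    public class ReadBodySketch {
        // Every read() below ultimately goes through CoyoteInputStream, so the
        // Tomcat worker thread can block here until the next AJP packet arrives
        // from httpd.
        static String inputStreamAsString(InputStream in) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }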

> 
>> Application overview:
>> My application has from 1,300 - 4,000 users connected at any given
>> time. Each user sends about 200 bytes, then waits 30 seconds, then
>> sends about 200 bytes, and this just loops for each user.
>> Each user connects with SSL, and we use a long keepalive to ensure that
>> the HTTP connection doesn't close, so that we don't have to renegotiate
>> SSL.
> 
> How long should the application take to process those 200 bytes? I'm 
> wondering what the peak number of concurrent requests, rather than 
> connections, might be.


You're welcome to take a look at the live Apache server status here:
http://venus.360works.com/server-status

At 7:30am this morning, it showed 1,733 connected users. Looking at the 'Req' 
column, which measures response time in milliseconds, values are typically in 
the 0-7 millisecond range, with some rare outliers in the hundreds of 
milliseconds.

At 12:07pm today, which is our peak time, for every 3-4 requests in the 0-8 
millisecond range, there is a request in the 150-700 millisecond range. 

>> 
>> Problem detail:
>> lsof is currently showing 564 open sockets between Apache and Tomcat on
>> port 8009, with 1,352 users connected to Apache.
>> The two threads consuming the most CPU time in Tomcat are
>> "NioBlockingSelector.BlockPoller-2 / 15" and
>> "ajp-nio-8009-ClientPoller-0 / 25". Between them, they are taking 20%
>> of all CPU time for the Tomcat process. I get a few times a day when
>> our monitoring software reports slow response times, and I'd like to
>> solve this.
> 
> How much actual processor time are those threads taking? The 20% is only 
> relative and it would be helpful to know what it is relative to.


The profiler shows the selector threads taking 264 milliseconds of CPU time during 
the profiled period (1 minute, 26 seconds), so I agree with you; that's not the 
culprit.

> 
> Do you have access logs available for those times? The number of concurrent 
> requests would be a useful number. When working that out you need to be sure 
> if the time in the access log is the time the request started or the time it 
> finished.
> 
>> Some guesses at solutions:
>> I'm guessing that the high CPU usage is because they are polling all
>> 564 open sockets constantly? Would it make sense to reduce the number
>> of open sockets? I didn't configure any maximum and I don't know how to
>> reduce this number. I'm also concerned that reducing that might negate
>> any benefits by increasing the number of sockets opening and closing
>> between ajp_mod_proxy and the NIO AJP connector.
> 
> Some rough numbers.
> With a peak of 9 httpd children with 500 threads each - assume at least 4000 
> connected clients.
> With a request every 30 seconds that is roughly 133 requests a second 
> assuming they are perfectly evenly distributed which they won't be.
> With 15 worker threads on Tomcat each one will be handling roughly 9 requests 
> a second, again assuming even distribution.
> That means a request has about 110ms to complete or you run the risk of 
> running out of threads. This includes the time to process the request and a 
> little overhead for Tomcat to recycle the thread.
> I wouldn't be surprised for peak loads to be at least 2-3 times higher due to 
> the uneven distribution of requests in time. That means requests have more 
> like 35-40ms to complete.

That sounds about right. The Apache status page is currently showing 153 
requests per second, with 2,912 connected users.
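
Working through your arithmetic with the live numbers (a back-of-the-envelope check 
only, and it assumes requests are evenly spread, which they won't be):

    public class ThreadBudget {
        public static void main(String[] args) {
            double requestsPerSecond = 153.0; // from the Apache status page just now
            int workerThreads = 15;           // current number of Tomcat worker threads

            double perThreadPerSecond = requestsPerSecond / workerThreads; // ~10.2 requests/s per thread
            double budgetMillis = 1000.0 / perThreadPerSecond;             // ~98 ms per request

            System.out.printf("Each request must finish in roughly %.0f ms to keep up%n", budgetMillis);
        }
    }

So at today's rate I'm already under your 110 ms figure, before allowing for bursts.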

> 
> I suspect that you are running out of worker threads on the Tomcat side. 
> Increasing it from 15 to 30 wouldn't do any harm.

This is a single-core box (sorry, I should have mentioned that in the 
configuration details). Would you still expect increasing the worker thread 
count to help?

> 
> Of course, this is just guess work. You'd need to look at your access logs to 
> be sure.
> 
>> Maybe it's already running at optimal performance and I just need to
>> throw hardware at it, but it seems like a solvable problem, because the
>> actual worker threads are not doing much at all.
> 
> I agree it sounds like configuration at this point.
> 
> Mark
> 


--Jesse Barnum, President, 360Works
http://www.360works.com
Product updates and news on http://facebook.com/360Works
(770) 234-9293
== Don't lose your data! http://360works.com/safetynet/ for FileMaker Server ==
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
