On 16/01/18 09:21, Toom Andreas wrote:
> Hi Mark,
>
> We pulled out a CPU Call Tree using JVisualVM instead of YourKit and I have
> uploaded a screenshot of it here: https://imgur.com/a/mqYxn
>
> There is not much extra information compared to the Java thread dump in my
> initial post but it highlights in which code branch/method call most of the
> CPU time is spent. Perhaps it will give some additional clues? Let me know
> if you think that it is worthwhile to use YourKit instead of JVisualVM and I
> will try to get that for you instead.
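For anyone reading along without a profiler to hand: the kind of data the JVisualVM CPU sampler produces can be approximated from inside the JVM with nothing but the standard ThreadMXBean API. The sketch below is illustrative only (the class name is invented and the thread/frame limits are arbitrary): it ranks threads by accumulated CPU time and prints their current stacks, which is usually enough to confirm which threads are spinning and in which method. It measures the JVM it runs in, so it would have to be executed inside Tomcat (for example from a scheduled task or a debug JSP) rather than as a separate process.

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;

// Rough, JDK-only sketch (hypothetical class name): rank the busiest threads
// in this JVM by accumulated CPU time and print their stacks. Running it a
// few times and seeing the same threads on top with the same stack is a
// strong hint about where the loop is.
public class HotThreadDump {

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            System.err.println("Thread CPU time measurement not available");
            return;
        }

        long[] ids = mx.getAllThreadIds();
        long[][] byCpu = new long[ids.length][2];       // {cpuTimeNanos, threadId}
        for (int i = 0; i < ids.length; i++) {
            byCpu[i][0] = mx.getThreadCpuTime(ids[i]);  // -1 if the thread has exited
            byCpu[i][1] = ids[i];
        }
        Arrays.sort(byCpu, (a, b) -> Long.compare(b[0], a[0]));

        for (int i = 0; i < Math.min(5, byCpu.length); i++) {
            ThreadInfo info = mx.getThreadInfo(byCpu[i][1], 20);  // top 20 frames
            if (info == null) {
                continue;
            }
            System.out.printf("%s: %.1f s CPU%n", info.getThreadName(), byCpu[i][0] / 1e9);
            for (StackTraceElement frame : info.getStackTrace()) {
                System.out.println("    at " + frame);
            }
        }
    }
}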
Thanks. This has given me a couple of ideas. I need to do some more testing.
I hope to be able to get this done in the next day or so.

Mark

> Best regards,
> Andreas
>
> -----Original Message-----
> From: Mark Thomas [mailto:ma...@apache.org]
> Sent: 12 January 2018 11:27
> To: Tomcat Users List <users@tomcat.apache.org>
> Subject: Re: Ajp Nio-thread stuck in loop and consuming a lot of cpu
>
> On 12/01/18 06:58, Toom Andreas wrote:
>> Hi,
>>
>> We are having an issue with an application running Apache Tomcat 8.0.47
>> using the Oracle JVM 1.8.0.151 on Linux (RHEL 7). The Tomcat process is
>> consuming CPU at a constant high level even though there is a low amount of
>> incoming traffic. When inspecting the process health using the JMX/JVisualVM
>> CPU Sampler we see that there are 4 ajp-nio-exec threads together with a
>> NioBlockingSelector.BlockPoller thread that consume most of the CPU.
>
> Are you able to determine the code path of the loop? A single stack trace can
> point us in the right direction but often the root cause can still be tricky
> to track down.
>
> A profiler such as YourKit in CPU tracing mode should be able to provide
> enough info to figure this out (and won't require any code changes to your
> system).
>
> Remote debugging is the other option but that is less ideal in production.
>
> Mark
>
>
>>
>> A stack trace of one of the ajp-nio-exec threads looks like this:
>>
>> "ajp-nio-48317-exec-14233" - Thread t@201195
>>    java.lang.Thread.State: TIMED_WAITING
>>        at sun.misc.Unsafe.park(Native Method)
>>        - parking to wait for <342fab60> (a java.util.concurrent.CountDownLatch$Sync)
>>        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>>        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>>        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>>        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>>        at org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch(NioEndpoint.java:1400)
>>        at org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitReadLatch(NioEndpoint.java:1402)
>>        at org.apache.tomcat.util.net.NioBlockingSelector.read(NioBlockingSelector.java:185)
>>        at org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:250)
>>        at org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:231)
>>        at org.apache.coyote.ajp.AjpNioProcessor.readSocket(AjpNioProcessor.java:194)
>>        at org.apache.coyote.ajp.AjpNioProcessor.read(AjpNioProcessor.java:160)
>>        at org.apache.coyote.ajp.AbstractAjpProcessor.readMessage(AbstractAjpProcessor.java:1091)
>>        at org.apache.coyote.ajp.AbstractAjpProcessor.process(AbstractAjpProcessor.java:804)
>>        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:684)
>>        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1539)
>>        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1495)
>>        - locked <279cc7f7> (a org.apache.tomcat.util.net.NioChannel)
>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
>>        at java.lang.Thread.run(Thread.java:748)
>>
>>    Locked ownable synchronizers:
>>        - locked <10e1f474> (a java.util.concurrent.ThreadPoolExecutor$Worker)
>>
>> A stack trace of the NioBlockingSelector.BlockPoller thread looks like this:
>>
>> "NioBlockingSelector.BlockPoller-2" - Thread t@17
>>    java.lang.Thread.State: RUNNABLE
>>        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>>        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>>        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>>        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>>        - locked <648895e4> (a sun.nio.ch.Util$3)
>>        - locked <17921532> (a java.util.Collections$UnmodifiableSet)
>>        - locked <67762111> (a sun.nio.ch.EPollSelectorImpl)
>>        at sun.nio.ch.SelectorImpl.selectNow(SelectorImpl.java:105)
>>        at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:339)
>>
>>    Locked ownable synchronizers:
>>        - None
>>
>> It seems like the worker thread is trying to read from the underlying
>> socket. When we inspect the OS tcp/ip statistics using the ss command
>> (ss -i src :<ajp-port>) the following connections are suspicious:
>>
>> State      Recv-Q Send-Q Local Address:Port   Peer Address:Port
>> CLOSE-WAIT 2      0      10.x.x.x:48317       10.x.x.11:53724
>>      cubic wscale:7,7 rto:202 rtt:1.724/2.424 ato:40 mss:1448 cwnd:10
>>      ssthresh:31 bytes_acked:343869507 bytes_received:22259372
>>      segs_out:302687 segs_in:141943 send 67.2Mbps lastsnd:1290791171
>>      lastrcv:1290786778 lastack:1290786767 pacing_rate 134.4Mbps
>>      retrans:0/433 rcv_rtt:1.875 rcv_space:28960
>> CLOSE-WAIT 1      0      10.x.x.x:48317       10.x.x.12:41990
>>      cubic wscale:7,7 rto:202 rtt:1.132/0.899 ato:40 mss:1448 cwnd:10
>>      ssthresh:22 bytes_acked:37347221 bytes_received:3346200
>>      segs_out:35273 segs_in:20608 send 102.3Mbps lastsnd:289840681
>>      lastrcv:289730802 lastack:289720796 pacing_rate 204.5Mbps
>>      retrans:0/13 reordering:24 rcv_rtt:1.875 rcv_space:28960
>> CLOSE-WAIT 1      0      10.x.x.x:48317       10.x.x.12:60068
>>      cubic wscale:7,7 rto:209 rtt:8.757/14.317 ato:40 mss:1448 cwnd:10
>>      ssthresh:36 bytes_acked:180903401 bytes_received:16401965
>>      segs_out:170017 segs_in:100392 send 13.2Mbps lastsnd:395602314
>>      lastrcv:395602267 lastack:395602257 pacing_rate 26.5Mbps
>>      retrans:0/106 rcv_rtt:1.875 rcv_space:28960
>> CLOSE-WAIT 2      0      10.x.x.x:48317       10.x.x.11:50628
>>      cubic wscale:7,7 rto:203 rtt:2.992/4.252 ato:40 mss:1448 cwnd:13
>>      ssthresh:25 bytes_acked:4185335 bytes_received:471220 segs_out:4193
>>      segs_in:2181 send 50.3Mbps lastsnd:1036916135 lastrcv:1036871377
>>      lastack:1036871367 pacing_rate 100.6Mbps retrans:0/1 reordering:26
>>      rcv_rtt:6.375 rcv_space:28960
>>
>> We are running our Tomcat node behind an Apache server and use Tomcat
>> Connector (mod_jk) as a load balancer. When looking at the hosts running
>> the Apache node, the corresponding sockets are closed. To us it seems that
>> Tomcat has not properly closed the underlying socket when Apache/mod_jk
>> initiated a FIN and is instead continuously trying to read from the socket
>> even though it has been closed on the Apache side?
>>
>> I know that we are one version behind the latest release of Tomcat 8.0.x but
>> I checked the changelog and I could not see any fixed issues matching this
>> problem. Unfortunately we are only seeing this in production and we have not
>> been able to reproduce it in test. Has anyone experienced a similar issue
>> before?
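A note on the CLOSE-WAIT entries above, since they fit that theory: CLOSE-WAIT means the peer (here mod_jk) has already sent its FIN and the local TCP stack has acknowledged it, but the local application has not yet closed its own socket. From that point on, every read on the socket returns end-of-stream immediately, so a read loop that treats that result as "no data yet" rather than "connection gone" will spin without ever blocking. The snippet below is a minimal, self-contained illustration of that behaviour using plain NIO; it is not Tomcat's actual code, and the class name is invented for the example.

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Minimal sketch (hypothetical class name): what a read looks like once the
// peer has closed its end of the connection. The accepted socket sits in
// CLOSE-WAIT and read() returns -1 immediately; the only correct reaction is
// to treat that as EOF and close our side as well.
public class CloseWaitReadDemo {

    public static void main(String[] args) throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));

            SocketChannel client = SocketChannel.open(server.getLocalAddress());
            SocketChannel accepted = server.accept();

            client.close();  // peer sends FIN -> accepted socket enters CLOSE-WAIT

            ByteBuffer buf = ByteBuffer.allocate(8192);
            int n = accepted.read(buf);  // returns -1: end of stream, not "try again"
            System.out.println("read() returned " + n);

            if (n < 0) {
                accepted.close();  // closing our side finally releases the connection
            }
        }
    }
}

Run standalone it prints "read() returned -1"; whether the looping code in this particular Tomcat version mishandles exactly that case is what the further testing mentioned above should show.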
>> For reference our connector configuration in server.xml looks like this:
>>
>> <Connector port="48317"
>>            protocol="AJP/1.3"
>>            maxThreads="200"
>>            connectionTimeout="600000"
>>            xpoweredBy="false"
>>            allowTrace="false"
>>            URIEncoding="UTF-8"
>>            requiredSecret="xxx"
>>            secure="true"
>>            scheme="https"
>>            proxyPort="443" />
>>
>> And our Tomcat Connector configuration for Apache in workers.properties
>> looks like this:
>>
>> worker.list = status,app
>>
>> # Create workers for app
>> worker.app-1.type=ajp13
>> worker.app-1.host=host8128.corpdomain.internal
>> worker.app-1.port=48317
>> worker.app-1.ping_mode=A
>> worker.app-1.connection_pool_timeout=600
>> worker.app-1.secret=xxx
>> worker.app-2.type=ajp13
>> worker.app-2.host=host8129.corpdomain.internal
>> worker.app-2.port=48317
>> worker.app-2.ping_mode=A
>> worker.app-2.connection_pool_timeout=600
>> worker.app-2.secret=xxx
>>
>> # Create a loadbalancer for app
>> worker.app.type=lb
>> worker.app.sticky_session=1
>> worker.app.balance_workers=app-1, app-2
>>
>> Best regards,
>> Andreas
>>
>>
>> Andreas Toom
>> Systems Developer / IT
>> TEL: +46 702 03 04 98
>> Volvofinans Bank AB <http://www.volvofinans.se/>
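One detail worth checking when reading the two configurations side by side: mod_jk's connection_pool_timeout is expressed in seconds while Tomcat's connectionTimeout is in milliseconds, and the mod_jk documentation recommends keeping the two in step. The values shown here are in fact aligned (600 s * 1000 = 600000 ms), so a simple timeout mismatch, which is a common cause of one side dropping idle AJP connections that the other side still believes are open, does not by itself explain the CLOSE-WAIT sockets above.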