Hi all,

I have good news, as I have identified the cause of the devastating NioEndpoint.Poller thread death:
In rare circumstances a ConcurrentModificationException can occur in the Poller's connection timeout handling, which is called from OUTSIDE the try-catch(Throwable) of Poller.run():

java.util.ConcurrentModificationException
        at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922)
        at java.util.HashMap$KeyIterator.next(HashMap.java:956)
        at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1067)
        at org.apache.tomcat.util.net.NioEndpoint$Poller.timeout(NioEndpoint.java:1437)
        at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1143)
        at java.lang.Thread.run(Thread.java:745)

Somehow the Poller's Selector object gets modified from another thread.

As a remedy until this is fixed properly by the Tomcat team, I have added a try-catch(ConcurrentModificationException) surrounding the for loop in Poller.timeout(). That way, if the rare problem occurs, the exception no longer kills the Poller thread, and a full iteration of the Selector is retried in the next call to Poller.timeout().

I am really happy, as all our production servers have been rock stable for two weeks now.

Best regards to all,
Lars Engholm Johansen

On Thu, Sep 18, 2014 at 7:03 PM, Filip Hanik <fi...@hanik.com> wrote:

> Thanks Lars, if you are indeed experiencing a non caught error, let us know
> what it is.
>
> On Thu, Sep 18, 2014 at 2:30 AM, Lars Engholm Johansen <lar...@gmail.com>
> wrote:
>
> > Thanks guys for all the feedback.
> >
> > I have tried the following suggested tasks:
> >
> >    - Upgrading Tomcat to the newest 7.0.55 on all our servers -> Problem
> >    still persists
> >    - Force a System.gc() when connection count is on the loose ->
> >    Connection count is not dropping
> >    - Lowering the log level of NioEndpoint class that contains the Poller
> >    code -> No info about why the poller thread exits in any tomcat logs
> >    - Reverting the JVM stack size per thread to the default as discussed
> >    previously -> Problem still persists
> >
> > I have now checked out the NioEndpoint source code and recompiled it with a
> > logging try-catch surrounding the whole of the Poller.run() implementation
> > as I noticed that the outer try-catch here only catches OOME.
> > I will report back with my findings as soon as the problem arises again.
> >
> > /Lars
> >
> > On Fri, Jun 27, 2014 at 9:02 PM, Christopher Schultz <
> > ch...@christopherschultz.net> wrote:
> >
> > > Filip,
> > >
> > > On 6/27/14, 11:36 AM, Filip Hanik wrote:
> > > > Are there any log entries that would indicate that the poller
> > > > thread has died? This/these thread/s start when Tomcat starts, and
> > > > a stack overflow on a processing thread should never affect the
> > > > poller thread.
> > >
> > > OP reported in the initial post that the thread had disappeared:
> > >
> > > On 6/16/14, 5:40 AM, Lars Engholm Johansen wrote:
> > > > We have no output in tomcat or our logs at the time when this event
> > > > occurs. The only sign is when comparing full java thread dump with
> > > > a dump from a newly launched Tomcat:
> > > >
> > > > One of http-nio-80-ClientPoller-0 or http-nio-80-ClientPoller-1
> > > > is missing/has died.
> > >
> > > -chris
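
P.S. For anyone who wants to see the shape of the stopgap described at the top of this mail, here is a minimal, self-contained sketch of the pattern. It is NOT the Tomcat source: the class TimeoutSweepSketch, its methods, and the String keys are hypothetical stand-ins for the Poller, its timeout() loop, and the Selector's key set. The only thing it is meant to show is that catching the ConcurrentModificationException around the for loop abandons the current pass and lets the next call start a fresh, full iteration.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical stand-in for the Poller's timeout handling; not Tomcat code.
class TimeoutSweepSketch {
    // Plays the role of the HashMap-backed key collection seen in the stack trace.
    private final Set<String> keys = new HashSet<>();
    private int abortedSweeps = 0;

    void register(String key) {
        keys.add(key);
    }

    // Visits every key once. Returns true if the pass completed, false if it
    // was abandoned because the collection changed underneath the iterator.
    boolean sweep(Consumer<String> perKey) {
        try {
            for (String key : keys) {
                perKey.accept(key);
            }
            return true;
        } catch (ConcurrentModificationException cme) {
            // Swallow the exception so the calling thread survives; the next
            // call to sweep() starts a new iterator and covers all keys again.
            abortedSweeps++;
            return false;
        }
    }

    int abortedSweeps() {
        return abortedSweeps;
    }
}
```

A caller would simply invoke sweep() on every loop pass and ignore a false result, since the following pass retries. Note that this only hides the symptom: the unsynchronized modification is still there, and ConcurrentModificationException is thrown on a best-effort basis, so it should be treated as a temporary remedy only.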