"Our sites still functions normally with no cpu spikes during this build up until around 60,000 connections, but then the server refuses further connections and a manual Tomcat restart is required."
Yes - the connection limit is a 16-bit count minus some reserved ports, so at around that point you have run out of port numbers (the port field in a TCP connection is a 16-bit value) and the system becomes unresponsive. Running netstat -na when this happens will show you the state of every connection, and that is helpful debugging information.

Filip

On Thu, Jun 19, 2014 at 2:44 PM, André Warnier <a...@ice-sa.com> wrote:
> Konstantin Kolinko wrote:
>> 2014-06-19 17:10 GMT+04:00 Lars Engholm Johansen <lar...@gmail.com>:
>>> I will try to force a GC next time I am at the console, about to restart a
>>> Tomcat where one of the http-nio-80-ClientPoller-x threads has died and
>>> the connection count is exploding.
>>>
>>> But I do not see this as a solution - can you somehow deduce why this
>>> thread died from the outcome of a GC?
>>
>> Nobody said that the thread died because of GC.
>>
>> The GC that Andre suggested was to get rid of some of the CLOSE_WAIT
>> connections in the netstat output, in case those are owned by abandoned
>> and not properly closed I/O objects that are still present in JVM memory.
>
> Exactly, thanks Konstantin for clarifying.
>
> I was going by the following in the original post:
>
> "Our sites still function normally with no CPU spikes during this build-up
> until around 60,000 connections, but then the server refuses further
> connections and a manual Tomcat restart is required."
>
> CLOSE_WAIT is a normal state for a TCP connection, but it should not
> normally last long. It basically indicates that the other side has closed
> the connection, and that this side should now do the same. As long as it
> doesn't, the connection remains in the CLOSE_WAIT state: it is "half-closed",
> but not entirely, and until it is fully closed the OS cannot get rid of it.
> For a more precise explanation, Google for "TCP CLOSE_WAIT state".
>
> I have noticed in the past, with some Linux versions, that when the number
> of such CLOSE_WAIT connections goes above a certain level (several hundred),
> the TCP/IP stack can become totally unresponsive and not accept any new
> connections at all, on any port.
> In my case, this was due to the following kind of scenario:
> some class Xconnection instantiates an object, and upon creation this
> object opens a TCP connection to something. The object is then used as an
> "alias" for that connection. Time passes, and finally the object goes out
> of scope (e.g. the reference to it is set to null), and one may believe
> that the underlying connection gets closed as a side effect. But it
> doesn't - not as long as the object has not actually been garbage-collected,
> which is what triggers the object's destruction and the closing of the
> underlying connection.
> Forcing a GC is one way to provoke this (restarting Tomcat is another, but
> more drastic).
>
> If a forced GC gets rid of your many CLOSE_WAIT connections and makes your
> Tomcat operative again, that would be a sign that something similar to the
> above is occurring, and then you would need to look in your application for
> the oversight (e.g. the class should have a "close" method that closes the
> underlying connection, and it should be invoked before letting the object
> go out of scope).
>
> The insidious part is that everything may look fine for a long time (apart
> from an occasional long list of CLOSE_WAIT connections). A GC will happen
> from time to time (*), which will get rid of these connections, and since
> CLOSE_WAIT connections do not consume many resources, you'll never notice.
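To make the scenario described above concrete, here is a minimal Java sketch of the leak and of the usual fix; the class and method names are hypothetical and not taken from the application under discussion:

    import java.io.IOException;
    import java.net.Socket;

    // Leak pattern: the wrapper opens a socket but offers no way to close it.
    // Dropping the reference does NOT close the connection; the socket stays
    // open (and, once the peer closes its end, sits in CLOSE_WAIT) until the
    // JVM happens to garbage-collect the wrapper and the socket's own cleanup
    // releases the file descriptor.
    class LeakyConnection {
        private final Socket socket;

        LeakyConnection(String host, int port) throws IOException {
            this.socket = new Socket(host, port);
        }

        void send(byte[] data) throws IOException {
            socket.getOutputStream().write(data);
        }
    }

    // Fix: expose an explicit close() (here via AutoCloseable) and call it
    // deterministically, e.g. with try-with-resources, so the connection is
    // fully closed as soon as the code is done with it.
    class ManagedConnection implements AutoCloseable {
        private final Socket socket;

        ManagedConnection(String host, int port) throws IOException {
            this.socket = new Socket(host, port);
        }

        void send(byte[] data) throws IOException {
            socket.getOutputStream().write(data);
        }

        @Override
        public void close() throws IOException {
            socket.close();
        }
    }

    class ConnectionDemo {
        public static void main(String[] args) throws IOException {
            try (ManagedConnection c = new ManagedConnection("example.org", 80)) {
                c.send("HEAD / HTTP/1.0\r\n\r\n".getBytes());
            } // closed here, nothing left behind in CLOSE_WAIT
        }
    }

A forced GC "fixing" the symptom is exactly what distinguishes the first variant from the second: collecting the unreachable LeakyConnection instances is the only thing that ever closes their sockets.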
> Until at some point the number of these CLOSE_WAIT connections gets to the
> point where the OS can't swallow any more of them, and then you have a big
> problem.
>
> That sounds a bit like your case, doesn't it?
>
> (*) And this is the "insidious squared" part: the smaller the heap, the
> more often a GC will happen, so the sooner these CLOSE_WAIT connections
> will disappear. Conversely, by increasing the heap size you leave more
> time between GCs, and make the problem more likely to happen.
>
> I believe that the rest below may be either a consequence or a red herring,
> and I would first eliminate the above as a cause.
>
>>> And could an Exception/Error in Tomcat thread http-nio-80-ClientPoller-0
>>> or http-nio-80-ClientPoller-1 make the thread die with no stack trace in
>>> the Tomcat logs?
>>
>> A critical error (java.lang.ThreadDeath, java.lang.VirtualMachineError)
>> will cause the death of a thread.
>>
>> A subtype of the latter is java.lang.OutOfMemoryError.
>>
>> As of now, such errors are passed through and are not logged by Tomcat,
>> but are logged by java.lang.ThreadGroup.uncaughtException(), which prints
>> them to System.err (catalina.out).
>>
>> Best regards,
>> Konstantin Kolinko
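Konstantin's last point is easy to reproduce outside of Tomcat. The short, self-contained sketch below (not Tomcat code; the thread name is made up) shows that an Error thrown out of a thread's run() method kills only that thread, and that unless an uncaught-exception handler is installed, the only trace of the death is whatever the default handler writes to System.err - which for Tomcat ends up in catalina.out:

    public class UncaughtErrorDemo {
        public static void main(String[] args) throws InterruptedException {
            // Route uncaught throwables to your own logging instead of relying
            // on the default handler's System.err output.
            Thread.setDefaultUncaughtExceptionHandler((t, e) ->
                    System.out.println("thread " + t.getName() + " died with: " + e));

            Thread worker = new Thread(() -> {
                // Simulate the critical error that silently kills a poller thread.
                throw new OutOfMemoryError("simulated");
            }, "demo-poller");

            worker.start();
            worker.join();

            // The JVM and the other threads keep running; only "demo-poller" is gone.
            System.out.println("main thread is still alive");
        }
    }

If the ClientPoller threads really are dying, searching catalina.out for the output of ThreadGroup.uncaughtException() is the quickest way to find out what killed them.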