Mark,
On 10/31/25 4:04 AM, Mark Thomas wrote:
On 30/10/2025 14:06, Christopher Schultz wrote:
<snip/>
That symptom, plus the "this is the only server using a NAT gateway"
would surely point to one place: the NAT gateway is killing
connections that are idle and surprising both stunnel and mod_jk. I
can also see a graph of non-zero numbers of "Idle Timeouts" on the NAT
gateway. It doesn't tell me more details about those timeouts, but
they are almost certainly outgoing AJP/stunnel connections.
Your reasoning above looks sound to me.
Thanks for the second brain's opinion on this.
But.
Here is my mod_jk workers configuration:
# Template worker
worker.template.type=ajp13
worker.template.host=localhost
worker.template.connection_pool_timeout=60
worker.template.socket_timeout=300
worker.template.max_packet_size=65536
worker.node1.reference=worker.template
worker.node1.port=7015
worker.node1.route=node1
My expectation is that connection_pool_timeout of 60 (seconds) will
close connections which have been idle for 60 seconds. If mod_jk
closes a connection, stunnel will also close that connection. (Note: I
have no explicit connectionTimeout or keepAliveTimeout on the Tomcat
side. But this doesn't seem to be a problem for the other two web
servers.)
Checking my configuration for the NAT gateway, it has a fixed idle
timeout of 350 seconds, which is much longer than the 60 seconds I
(believe I) have set for idle AJP connections.
I do not use servlet async or Websocket for anything in my
application, so I do not expect long-lasting connections between
client and server.
Is there anything I haven't checked at this point?
You might want to check how connection_pool_timeout interacts with
connection_pool_size and connection_pool_minsize.
My current connection_pool_size and connection_pool_minsize are the
defaults, so I suspect they will be:
connection_pool_size = ThreadsPerChild = 25
connection_pool_minsize = (25 + 1) / 2 = 13
So, basically, default everything.
Looking at ps, I see that there are 6 httpd processes running as the
apache user, and one running as root (the control process).
The mod_status page says I have 124 idle workers and 1 in-flight
request. So I believe I do in fact have the default configuration of 25
threads per process; with 6 processes I have 125 total threads.
netstat tells me that I currently have 112 ESTABLISHED connections owned
by stunnel.
I am wondering: if the current pool size is at minsize, will an idle
connection still be closed once it exceeds the timeout?
I think the code that handles this is here:
https://github.com/apache/tomcat-connectors/blob/main/native/common/jk_ajp_common.c#L3510
and I think it only closes idle connections until the pool reaches minsize.
I'll have to do a lot of reading to determine how that's going to
behave. So many configuration properties have changed names over time
that it's not 100% clear what the code will do, but ...
https://github.com/apache/tomcat-connectors/blob/main/native/common/jk_ajp_common.c#L3567
This certainly seems like it will only "clean" connections until it gets
to the min size.
Which isn't what the docs suggest. Which in turn might explain why you
are seeing connections open longer than the NAT gateway timeout.
Rainer understands this code far better than I do. It would be good to
get his view on this. If I am right, I think we either need to update
the docs or we need to fix the code so idle connections below minsize
are closed and then re-opened to refresh them.
So it looks like I have a few options for immediate relief here, though
if there is a bug (or missing feature) in mod_jk, then fixing that is
the best long-term solution for me: I'd like AJP connections that have
been open for "too long" to simply be recycled. (I've sketched the most
promising settings in config form after the list below.)
1. Set connection_pool_minsize=0. This seems ... potentially problematic?
2. Set socket_keepalive on the worker. The documentation is not
encouraging, as it suggests that the keepalives may be sent on the order
of hours and not minutes.
3. Use stunnel's TIMEOUTidle=300 (or similar) to get stunnel to kill
idle connections after too much time has passed (but less than the NAT
idle timeout). I think this just moves the problem from the NAT router
surprising stunnel and mod_jk to stunnel surprising mod_jk.
4. Use stunnel's keepalive capabilities.
socket = l:SO_KEEPALIVE=1
socket = r:SO_KEEPALIVE=1
I believe this is similar to mod_jk's keepalives, which rely on the OS.
Modifying the global OS settings for how keepalives behave doesn't feel
right, so I'm going to drop these two options from my list.
5. Set Tomcat's keepAliveTimeout instead of leaving it at the default. I
think this will close the connection but mod_jk won't know about it
until it attempts another write. (Right?)
6. Use ping_mode=P or ping_mode=I with an appropriate
connection_ping_interval and ping_timeout. This seems the most
promising, because it will probe the connection before using it.
This kind of happens already, because when mod_jk attempts to use a
connection and discovers it's unusable, it will roll over to another
connection. But I think it will only do it 3 times before it (1) gives
up and (2) marks the worker as ERR.
If I use ping_mode=P or I, will this change the above behavior at all? I
would like mod_jk in this case to ping each connection, discover it's
unusable, and recycle it but not mark the worker as being in the ERR
state, even if mod_jk needs to recycle a whole bunch of connections in
order to find one that works.
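
To make that concrete, here is roughly what I have in mind for options
1, 3, and 6. The property names are taken from the mod_jk and stunnel
documentation; the values are only illustrative guesses on my part (per
my reading of the docs, connection_ping_interval is in seconds and
ping_timeout is in milliseconds), and I have not tested any of this yet:

# workers.properties (options 1 and 6)
worker.template.connection_pool_minsize=0
worker.template.ping_mode=I
# probe connections idle longer than this many seconds; keep it well
# below the NAT gateway's 350-second idle timeout
worker.template.connection_ping_interval=100
# how long (milliseconds) to wait for the CPong reply
worker.template.ping_timeout=10000

# stunnel service section (option 3)
TIMEOUTidle = 300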
Are there any other options I haven't identified above?
I have other nuclear options, including allocating a public IP address
for this instance and using that directly. My goal was actually to push
everything behind this NAT router for a couple of reasons, so I would
like to figure out how to get it working.
As the subject says, I do have 2 other web servers, so I can experiment
with this one, including putting it in debug/trace log mode, removing it
from the hw load-balancer, poking it with traffic only generated by me, etc.
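
When I do that, I'll probably start with something like the following to
get more detail out of mod_jk and stunnel (the directive names are from
their respective docs; the levels are just my first guess at what will
be useful without drowning me in output):

# httpd side (mod_jk); "trace" is even more verbose than "debug"
JkLogLevel debug

# stunnel side; 7 is the syslog debug level
debug = 7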
-chris
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]