Hello,

We are experiencing some odd problems with Linux + Tomcat + Apache + mod_jk + SSL. We have a load-balancing setup that works for a few hours (about 5) and then just stops responding. After I experimented with the worker properties (adding socket_timeout, cache_timeout and recycle_timeout), the application now stalls for a while and starts working again after a few minutes.
My guess is that some socket limit is being exhausted, which causes the application to freeze for a while. It is difficult to collect any useful information, because this happens only once, at most twice, a day. About 10 users are testing it.

Versions: Tomcat 5.5.12, Apache 2.0.52, mod_jk 1.2.15

This is what was in mod_jk.log before I set up the timeouts; since then the log has been empty:

[Mon Nov 28 10:29:53 2005] [error] ajp_service::jk_ajp_common.c (1758): Error connecting to tomcat. Tomcat is probably not started or is listening on the wrong port. worker=ajp132 failed
[Tue Nov 29 07:05:26 2005] [error] ajp_get_reply::jk_ajp_common.c (1503): Tomcat is down or refused connection. No response has been sent to the client (yet)
[Tue Nov 29 07:05:26 2005] [error] ajp_get_reply::jk_ajp_common.c (1503): Tomcat is down or refused connection. No response has been sent to the client (yet)
[Tue Nov 29 07:05:26 2005] [error] ajp_service::jk_ajp_common.c (1715): receiving reply from tomcat failed without recovery in send loop 0
[Tue Nov 29 07:05:26 2005] [error] ajp_service::jk_ajp_common.c (1715): receiving reply from tomcat failed without recovery in send loop 0
[Tue Nov 29 07:05:26 2005] [error] service::jk_lb_worker.c (687): unrecoverable error 502, request failed. Tomcat failed in the middle of request, we can't recover to another instance.
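Since the failures are rare, it may help to tally the mod_jk errors by source location and by minute, so that separate incidents can be compared. Below is a rough Python sketch of that idea. It assumes only the line layout visible in the excerpt above ([timestamp] [level] function::file (line): message); the names parse_line, summarize and SAMPLE are made up for this illustration and are not part of mod_jk.

```python
import re
from collections import Counter
from datetime import datetime

# Layout assumed from the log excerpt in this post:
# [Tue Nov 29 07:05:26 2005] [error] func::file.c (1503): message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] "
    r"(?P<func>\w+)::(?P<file>[\w.]+) \((?P<line>\d+)\): (?P<msg>.*)$"
)

# A few lines copied from the post, used only as sample input.
SAMPLE = """\
[Mon Nov 28 10:29:53 2005] [error] ajp_service::jk_ajp_common.c (1758): Error connecting to tomcat. Tomcat is probably not started or is listening on the wrong port. worker=ajp132 failed
[Tue Nov 29 07:05:26 2005] [error] ajp_get_reply::jk_ajp_common.c (1503): Tomcat is down or refused connection. No response has been sent to the client (yet)
[Tue Nov 29 07:05:26 2005] [error] ajp_get_reply::jk_ajp_common.c (1503): Tomcat is down or refused connection. No response has been sent to the client (yet)
[Tue Nov 29 07:05:26 2005] [error] ajp_service::jk_ajp_common.c (1715): receiving reply from tomcat failed without recovery in send loop 0
[Tue Nov 29 07:05:26 2005] [error] service::jk_lb_worker.c (687): unrecoverable error 502, request failed. Tomcat failed in the middle of request, we can't recover to another instance.
"""


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Count [error] entries per (function, file, line) and per minute."""
    by_site = Counter()
    by_minute = Counter()
    for raw in lines:
        entry = parse_line(raw)
        if entry is None or entry["level"] != "error":
            continue  # skip non-matching lines and non-error levels
        by_site[(entry["func"], entry["file"], int(entry["line"]))] += 1
        stamp = datetime.strptime(entry["ts"], "%a %b %d %H:%M:%S %Y")
        by_minute[stamp.replace(second=0)] += 1
    return by_site, by_minute
```

To run it against a real log, something like summarize(open("mod_jk.log")) should work, where the path is whatever JkLogFile points to on the machine. Bursts that share a minute, like the one at 07:05 above, would then show up as a single bucket in the per-minute counter.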
And here is my workers.properties:

worker.list=loadbalancer #, ajp131, ajp132
worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=ajp131, ajp132
worker.loadbalancer.sticky_session=True

# Workers
worker.ajp131.port=8009
worker.ajp131.host=127.0.0.1
worker.ajp131.type=ajp13
worker.ajp131.lbfactor=1
#worker use up to 1 sockets, which will stay no more than 10mn in cache
worker.ajp131.cachesize=1
worker.ajp131.cache_timeout=600
worker.ajp131.socket_timeout=180
#worker ask operating system to send KEEP-ALIVE signal on the connection
worker.ajp131.socket_keepalive=1
#worker want ajp13 connection to be dropped after 5mn (recycle)
worker.ajp131.recycle_timeout=300

worker.ajp132.port=8009
worker.ajp132.host=10.241.154.124
worker.ajp132.type=ajp13
worker.ajp132.lbfactor=2
#worker use up to 1 sockets, which will stay no more than 10mn in cache
worker.ajp132.cachesize=1
worker.ajp132.cache_timeout=600
worker.ajp132.socket_timeout=180
#worker ask operating system to send KEEP-ALIVE signal on the connection
worker.ajp132.socket_keepalive=1
#worker want ajp13 connection to be dropped after 5mn (recycle)
worker.ajp132.recycle_timeout=300

Any ideas for solutions, or pointers on how to debug this whole damn thing, are very welcome.

Thanks a lot.