I apologize for all the messages. But I thought this might also be useful info:
Normally, when I run: netstat -tn | egrep "^tcp.*:8009" on the web server and the app server, the connections match up. However, after I start getting the errors below, they seem to not. For example, I currently have the following on the web server: [EMAIL PROTECTED] access]# netstat -tn | egrep "^tcp.*:8009" tcp 0 0 10.0.0.212:53768 10.0.0.9:8009 ESTABLISHED tcp 0 0 10.0.0.212:53772 10.0.0.9:8009 ESTABLISHED [EMAIL PROTECTED] ~]$ netstat -tn | egrep "^tcp.*:8009" tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53772 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53768 ESTABLISHED tcp 0 3084 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53181 ESTABLISHED tcp 0 1660 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53180 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53182 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53177 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53176 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53179 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53178 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53173 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53174 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53169 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53168 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53170 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53165 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53167 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53166 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53161 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53163 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53157 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53156 ESTABLISHED tcp 0 3108 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53158 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53152 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53155 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53154 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53149 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53148 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53151 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53150 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53145 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53146 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53141 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53143 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53129 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53184 ESTABLISHED tcp 0 0 ::ffff:10.0.0.9:8009 ::ffff:10.0.0.212:53187 ESTABLISHED Thanks, -Michael But I have the following on the app server: --- michael thomas <[EMAIL PROTECTED]> wrote: > We actually are seeing error messages in the mod_jk > logs now: > > [Sun Feb 26 01:02:52 2006] [12249:43360] [error] > ajp_connection_tcp_get_message::jk_ajp_common.c > (961): > Can't receive the response message from tomcat, > network problems or tomcat is down (10.0.0.9:8009), > err=-113 > > [Sun Feb 26 01:02:52 2006] [12249:43360] [error] > ajp_get_reply::jk_ajp_common.c (1503): Tomcat is > down > or refused connection. No response has been sent to > the client (yet) > > These are recoverable errors, so I'm not sure if > they're responsible for the problems we're having, > but > they sure don't look good. > > We have no firewall between Apache and Tomcat. > MaxRequestsPerChild is 0 > And we haven't set the recycle_timeout or > socket_timeout in workers.properties, so they should > be non-issues. > > Also, I have seen that in the last few months, there > have been quite a few threads on this, with no real > apparent solution. > > I'm wondering if the thread posters found a > solution. > > We've also seen messages like this: > > [Sun Feb 26 01:04:16 2006] [12248:2400] [info] > ajp_service::jk_ajp_common.c (1749): Sending request > to tomcat failed, recoverable operation attempt=1 > [Sun Feb 26 01:04:16 2006] [12248:2400] [info] > ajp_send_request::jk_ajp_common.c (1178): Socket 19 > is > not connected any more (errno=-1) > [Sun Feb 26 01:04:16 2006] [12248:2400] [info] > ajp_send_request::jk_ajp_common.c (1202): Error > sending request. Will try another pooled connection > [Sun Feb 26 01:04:16 2006] [12248:2400] [info] > ajp_send_request::jk_ajp_common.c (1225): All > endpoints are disconnected or dead > [Sun Feb 26 01:04:16 2006] [12248:2400] [info] > ajp_service::jk_ajp_common.c (1749): Sending request > to tomcat failed, recoverable operation attempt=2 > [Sun Feb 26 01:04:16 2006] [12248:2400] [debug] > jk_open_socket::jk_connect.c (328): socket > TCP_NODELAY > set to On > [Sun Feb 26 01:04:16 2006] [12248:2400] [debug] > jk_open_socket::jk_connect.c (426): trying to > connect > socket 7 to 10.0.0.9:8009 > [Sun Feb 26 01:04:16 2006] [12248:2400] [info] > jk_open_socket::jk_connect.c (444): connect to > 10.0.0.9:8009 failed with errno=111 > [Sun Feb 26 01:04:16 2006] [12248:2400] [info] > ajp_connect_to_endpoint::jk_ajp_common.c (889): > Failed > opening socket to (10.0.0.9:8009) with (errno=111) > [Sun Feb 26 01:04:16 2006] [12248:2400] [info] > ajp_send_request::jk_ajp_common.c (1248): Error > connecting to the Tomcat process. > [Sun Feb 26 01:04:16 2006] [12248:2400] [info] > ajp_service::jk_ajp_common.c (1749): Sending request > to tomcat failed, recoverable operation attempt=3 > [Sun Feb 26 01:04:16 2006] [12248:2400] [error] > ajp_service::jk_ajp_common.c (1758): Error > connecting > to tomcat. Tomcat is probably not started or is > listening on the wrong port. worker=app01 failed > [Sun Feb 26 01:04:16 2006] [12248:2400] [debug] > ajp_done::jk_ajp_common.c (2074): recycling > connection > cache slot=2 for worker app01 > [Sun Feb 26 01:04:16 2006] [12248:2400] app01 > www.mysite.com 802.465676 POST /index.jsp > > [Sun Feb 26 01:04:16 2006] [12248:2400] [info] > jk_handler::mod_jk.c (1985): Service error=0 for > worker=app01 > [Sun Feb 26 01:04:18 2006] [12241:60352] [debug] > jk_cleanup_shmem::mod_jk.c (1761): Shmem cleanup > > > Thanks, > -Michael > > > > --- michael thomas <[EMAIL PROTECTED]> wrote: > > > Hello, > > > > I posted a couple of other threads with no > replies. > > I > > realize this is a complicated problem, but I would > > really, really be thankful for any help. > > > > Though I seem to have gotten rid of some of the > > previous problems I was having with mod_jk > requests > > getting frozen (by getting rid of the > > recycle_timeout, > > socket_keepalive and socket_timeout settings), I'm > > still having problems. > > > > PROBLEM SCENARIO: > > 1. Client has its requests go through for a while. > > > (Apache keepalive is true, so it keeps open a > > connection.) > > 2. Client is idle for a while. > > 3. Client's next request after being idle for a > > while > > SOMETIMES gets stuck after we redirect the > client's > > request. > > > > Below is a mod_jk log of the stuck redirect > request. > > > > We receive it, but it seems to be getting stuck in > > mod_jk on "request body to send 0 - request body > to > > resend 0". (Though we've had it get stuck in > other > > places, like "maintaining worker".) > > > > If the client then resubmits the EXACT same > request, > > it goes through. > > > > As I mentioned in the other two threads over the > > last > > few days, the client first does a POST, then the > > server redirects the POST to a GET. The POST > always > > goes through. The GET sometimes gets stuck. > > > > Here's the end of the client's POST (which is to > > tell > > it to redirect): > > ##################################### > > <SNIP> > > [Sat Feb 25 17:27:27 2006] [19074:14688] [debug] > > ajp_connection_tcp_get_message::jk_ajp_common.c > > (1037): 0070 4F 2D 38 38 35 39 2D 31 00 00 0E > 43 > > 6F > > 6E 74 65 - O-8859-1...Conte > > [Sat Feb 25 17:27:27 2006] [19074:14688] [debug] > > ajp_connection_tcp_get_message::jk_ajp_common.c > > (1037): 0080 6E 74 2D 4C 65 6E 67 74 68 00 00 > 01 > > 30 > > 00 00 00 - nt-Length...0... > > [Sat Feb 25 17:27:27 2006] [19074:14688] [debug] > > ajp_unmarshal_response::jk_ajp_common.c (621): > > status > > = 302 > > [Sat Feb 25 17:27:27 2006] [19074:14688] [debug] > > ajp_unmarshal_response::jk_ajp_common.c (628): > > Number > > of headers is = 3 > > [Sat Feb 25 17:27:27 2006] [19074:14688] [debug] > > ajp_unmarshal_response::jk_ajp_common.c (684): > > Header[0] [Location] = > > [http://www.sitemadness.com/index.jsp] > > [Sat Feb 25 17:27:27 2006] [19074:14688] [debug] > > ajp_unmarshal_response::jk_ajp_common.c (684): > > Header[1] [Content-Type] = > > [text/html;charset=ISO-8859-1] > > [Sat Feb 25 17:27:27 2006] [19074:14688] [debug] > > ajp_unmarshal_response::jk_ajp_common.c (684): > > Header[2] [Content-Length] = [0] > > [Sat Feb 25 17:27:27 2006] [19074:14688] [debug] > > ajp_connection_tcp_get_message::jk_ajp_common.c > > (1037): received from ajp13 pos=0 len=2 max=8192 > > [Sat Feb 25 17:27:27 2006] [19074:14688] [debug] > > ajp_connection_tcp_get_message::jk_ajp_common.c > > (1037): 0000 05 01 00 00 00 00 00 00 00 00 00 > 00 > > 00 > > 00 00 00 - ................ > > [Sat Feb 25 17:27:27 2006] [19074:14688] [debug] > > ajp_done::jk_ajp_common.c (2074): recycling > > connection > > cache slot=1 for worker app01 > > [Sat Feb 25 17:27:27 2006] [19074:14688] app01 > > www.sitemadness.com 0.008015 POST /index.jsp > > > > [Sat Feb 25 17:27:27 2006] [19074:14688] [debug] > === message truncated === __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]