http://issues.apache.org/bugzilla/show_bug.cgi?id=36281

           Summary: problem with Failover in jk_lb_worker.c
           Product: Tomcat 5
           Version: 5.0.30
          Platform: Sun
        OS/Version: Solaris
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Native:JK
        AssignedTo: tomcat-dev@jakarta.apache.org
        ReportedBy: [EMAIL PROTECTED]

Apache 1.3.33
Tomcat 5.0.30
mod_jk 1.2.14
Solaris 9

Hi,

There seems to be a problem in supporting Failover of a failed Tomcat server on Solaris 9 with mod_jk 1.2.14. When I simulate the failed Tomcat server by pulling the network cable, the worker will do all the appropriate socket_timeouts and retries, but when that fails, as it should, another worker is not chosen, and the failed worker is not put in_error.

Here is a snippet from the log:

[trace] service::jk_lb_worker.c (551): enter
[trace] get_most_suitable_worker::jk_lb_worker.c (453): enter
[debug] get_most_suitable_worker::jk_lb_worker.c (539): found best worker (giraffe) using by request method
[trace] get_most_suitable_worker::jk_lb_worker.c (543): exit
[debug] service::jk_lb_worker.c (587): service worker=giraffe jvm_route=giraffe
[trace] ajp_service::jk_ajp_common.c (1630): enter
[debug] ajp_service::jk_ajp_common.c (1670): processing with 3 retries
[info] ajp_service::jk_ajp_common.c (1749): Sending request to tomcat failed, recoverable operation attempt=1
[info] ajp_service::jk_ajp_common.c (1749): Sending request to tomcat failed, recoverable operation attempt=2
[info] ajp_service::jk_ajp_common.c (1749): Sending request to tomcat failed, recoverable operation attempt=3
[error] ajp_service::jk_ajp_common.c (1758): Error connecting to tomcat. Tomcat is probably not started or is listening on the wrong port. worker=giraffe failed
[trace] ajp_service::jk_ajp_common.c (1768): exit
<log ends>

The ajp_service::jk_ajp_common.c method is called by the service::jk_lb_worker.c method, but there are no more log messages after ajp_service returns. We should at least see an exit trace log for the service method.
I put logging statements in the code and found the offending lines.

From jk_lb_worker.c ln. 603-607:

    service_stat = end->service(end, s, l, &is_service_error);
    /* IT IS ONE OF THESE TWO LINES THAT CAUSES THE THREAD TO DIE OR HANG */
    rec->s->readed += end->rd;
    rec->s->transferred += end->wr;
    end->done(&end, l);

A logging message placed directly after the end->service call is seen, but one placed right before the end->done call is not. In any case, if I comment out the two lines that update the shared memory, everything works as expected.
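One way to narrow this down would be to guard the two statistics updates instead of commenting them out. The sketch below is only an illustration: the struct types and the update_stats helper are hypothetical stand-ins, not the real mod_jk definitions, and the idea that rec->s or end might be an invalid (for example NULL) pointer at this point is an assumption that the log above does not confirm.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the mod_jk types; only the fields that
 * appear in the snippet from jk_lb_worker.c are modeled here. */
typedef struct {
    size_t readed;       /* spelling follows the field name in the report */
    size_t transferred;
} shm_stats_t;

typedef struct {
    shm_stats_t *s;      /* assumed: may be NULL if shared memory is unavailable */
} worker_record_t;

typedef struct {
    size_t rd;           /* bytes read during the request */
    size_t wr;           /* bytes written during the request */
} endpoint_t;

/* Hypothetical helper: add the endpoint counters to the shared record.
 * Returns 1 if the counters were updated, 0 if the update was skipped
 * because one of the pointers was NULL. */
int update_stats(worker_record_t *rec, const endpoint_t *end)
{
    if (rec == NULL || rec->s == NULL || end == NULL)
        return 0;
    rec->s->readed += end->rd;
    rec->s->transferred += end->wr;
    return 1;
}
```

If a guard like this were placed before end->done, logging a zero return value would show whether a missing shared-memory record is what stops the thread. It would not help if the pointer is non-NULL but refers to unmapped memory.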