DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=36281>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=36281

           Summary: problem with Failover in jk_lb_worker.c
           Product: Tomcat 5
           Version: 5.0.30
          Platform: Sun
        OS/Version: Solaris
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Native:JK
        AssignedTo: tomcat-dev@jakarta.apache.org
        ReportedBy: [EMAIL PROTECTED]


Apache 1.3.33
Tomcat 5.0.30
mod_jk 1.2.14
Solaris 9

Hi,

Failover away from a failed Tomcat server does not seem to work on Solaris 9
with mod_jk 1.2.14.  When I simulate a failed Tomcat server by pulling the
network cable, the worker goes through the expected socket_timeouts and
retries.  When those fail, as they should, no other worker is chosen, and the
failed worker is not put in_error.  Here is a snippet from the log:

[trace] service::jk_lb_worker.c (551): enter
[trace] get_most_suitable_worker::jk_lb_worker.c (453): enter
[debug] get_most_suitable_worker::jk_lb_worker.c (539): found best worker (giraffe) using by request method
[trace] get_most_suitable_worker::jk_lb_worker.c (543): exit
[debug] service::jk_lb_worker.c (587): service worker=giraffe jvm_route=giraffe
[trace] ajp_service::jk_ajp_common.c (1630): enter
[debug] ajp_service::jk_ajp_common.c (1670): processing with 3 retries
[info]  ajp_service::jk_ajp_common.c (1749): Sending request to tomcat failed,  recoverable operation attempt=1
[info]  ajp_service::jk_ajp_common.c (1749): Sending request to tomcat failed,  recoverable operation attempt=2
[info]  ajp_service::jk_ajp_common.c (1749): Sending request to tomcat failed,  recoverable operation attempt=3
[error] ajp_service::jk_ajp_common.c (1758): Error connecting to tomcat. Tomcat is probably not started or is listening on the wrong port. worker=giraffe failed
[trace] ajp_service::jk_ajp_common.c (1768): exit
<log ends>
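
For reference, the failover behaviour I would expect at this point can be
sketched roughly as below.  This is only a minimal illustration, not the
actual mod_jk code: the sketch_* types and functions are hypothetical
stand-ins, and "zebra" in the usage note is a made-up second worker.  The idea
is simply that a worker whose service call fails gets flagged in_error and the
next healthy worker is tried.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a balanced worker.
 * This is NOT the real mod_jk worker record. */
typedef struct sketch_worker {
    const char *name;
    int in_error;                              /* set to 1 once the worker fails */
    int (*service)(struct sketch_worker *w);   /* returns 1 on success, 0 on failure */
} sketch_worker_t;

/* Try each healthy worker in order.  A worker that fails is marked
 * in_error so later requests skip it.  Returns the worker that served
 * the request, or NULL if every worker failed. */
static sketch_worker_t *sketch_lb_service(sketch_worker_t *workers, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        sketch_worker_t *w = &workers[i];

        if (w->in_error)
            continue;            /* already known to be down */
        if (w->service(w))
            return w;            /* request was served */
        w->in_error = 1;         /* fail over to the next worker */
    }
    return NULL;
}

/* Hypothetical service callbacks used only to exercise the sketch. */
static int sketch_always_fail(sketch_worker_t *w) { (void)w; return 0; }
static int sketch_always_ok(sketch_worker_t *w)   { (void)w; return 1; }
```

With two workers, "giraffe" using sketch_always_fail and "zebra" using
sketch_always_ok, sketch_lb_service should return the second worker and leave
the first one flagged in_error.  In the log above, neither of those two things
happens.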

ajp_service (in jk_ajp_common.c) is called by service (in jk_lb_worker.c), but
there are no more log messages after ajp_service returns.  We should at least
see an exit trace log for the service method.  I added logging statements to
the code and found the offending lines.

From jk_lb_worker.c, lines 603-607:

service_stat = end->service(end, s, l, &is_service_error);
/* IT IS ONE OF THESE TWO LINES THAT CAUSES THE THREAD TO DIE OR HANG */
rec->s->readed += end->rd;
rec->s->transferred += end->wr;
end->done(&end, l);

A log message placed directly after the end->service call shows up in the log,
but one placed right before the end->done call does not.
In any case, if I comment out the two lines that update the shared memory,
everything works as expected.
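
As a less drastic workaround than commenting the lines out, the update could be
guarded.  The sketch below assumes the hang or crash comes from rec->s not
pointing at valid shared memory, which I have not confirmed.  The sketch_*
types are hypothetical stand-ins that only mirror the field names in the
snippet above, not the real mod_jk structures.

```c
#include <stddef.h>

/* Hypothetical stand-ins mirroring only the fields used in the snippet. */
typedef struct sketch_shm_stats {
    unsigned long readed;        /* bytes read, as named in the snippet */
    unsigned long transferred;   /* bytes written */
} sketch_shm_stats_t;

typedef struct sketch_rec {
    sketch_shm_stats_t *s;       /* assumed to point into shared memory */
} sketch_rec_t;

typedef struct sketch_endpoint {
    unsigned long rd;
    unsigned long wr;
} sketch_endpoint_t;

/* Update the shared counters only when every pointer is non-NULL.
 * Returns 1 if the counters were updated, 0 if the update was skipped. */
static int sketch_update_stats(sketch_rec_t *rec, const sketch_endpoint_t *end)
{
    if (rec == NULL || rec->s == NULL || end == NULL)
        return 0;
    rec->s->readed += end->rd;
    rec->s->transferred += end->wr;
    return 1;
}
```

A NULL check like this would not catch a dangling or unmapped pointer, so it
only helps if the shared memory segment was never attached in the first place.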

