----- Original Message -----
From: "Glenn Nielsen" <[EMAIL PROTECTED]>
To: "Tomcat Developers List" <[EMAIL PROTECTED]>
Sent: Wednesday, September 24, 2003 12:28 PM
Subject: Re: mod_jk does not detect a hung Tomcat

>
>
> Henri Gomez wrote:
> > David Rees wrote:
> >
> >> Henri Gomez said:
> >>
> >>> Henri Gomez wrote:
> >>>
> >>>>> Nope, since you don't have to test just at the protocol level but
> >>>>> also at a higher level, for instance checking the full chain, up to
> >>>>> servlet handling.
> >>>>>
> >>>>>
> >>>>>> It's easy to simulate this behavior by sending a STOP signal to
> >>>>>> Tomcat.
> >>>>>>
> >>>>>> I've also attached a log from mod_jk showing the problem. I marked
> >>>>>> the point at which processing in mod_jk stopped until I sent a CONT
> >>>>>> signal to Tomcat.
> >>>>>>
> >>>>>> Does mod_jk2 have this same problem? Is there any interest in fixing
> >>>>>> this? Does anyone have a workaround for this issue?
> >>>>>
> >>>>>
> >>>>> Well, if you have a hung Tomcat, you're probably already in serious
> >>>>> trouble.
> >>>>
> >>
> >>
> >> No, actually in my case I wasn't. I had two Tomcats running, as one was
> >> prone to locking up due to a JVM or application bug. With a 50-50 load
> >> distribution between two Tomcats, this left me with 1/2 of the requests
> >> getting stuck and clients waiting forever and tying up Apache
> >> processes. Eventually, a DoS will be the result if action is not taken
> >> in time. If mod_jk noticed it wasn't really alive, this wouldn't be an
> >> issue at all.
> >>
> >>
> >>>>> Anyway, if we add stuff like a timeout in the ajp request, you could
> >>>>> be stuck with long-running servlets. Also, jk reads requests in
> >>>>> blocking mode for performance, and adding a timeout here is not an
> >>>>> option.
> >>>>
> >>
> >>
> >> Agreed that we wouldn't normally want a timeout, so that normal
> >> long-running servlet processes are handled, but if a PING/PONG were
> >> added to the protocol there should be a timeout to prevent the above
> >> situation.
> >>
> >>
> >>>> When I worked on the ajp13++ (ajp14) protocol, I added a more secure
> >>>> auth mechanism at connection time.
> >>>>
> >>>> Since there is bidirectional communication, jk could detect that
> >>>> even if the connection is open, the remote didn't respond, and so fall
> >>>> back to the next one in the cluster configuration.
> >>>>
> >>>> But on already established connections, the problem persists.
> >>>>
> >>>> Or we should add a PING/PONG before sending any request to Tomcat.
> >>>>
> >>>> It could be done as optional, but I'll work on it only if many users
> >>>> make such requirements
> >>>
> >>>
> >>> if many users ask for such a feature ;)
> >>
> >>
> >>
> >> Well, you've got one so far. ;-) Adding a configurable option to have
> >> mod_jk verify (PING/PONG) that Tomcat is actually responding before using
> >> the connection would solve the problem, and I can't imagine that it would
> >> add a lot of complexity to the code either. If I wasn't so rusty with
> >> my C programming and had some spare time, I would offer to help code it
> >> up. ;-) In any case, I'll be more than happy to help test.
> >
> >
> > Well, if you could find more users or at least one Tomcat committer
> > (Glenn, Remy, Costin, JFC...) who needs it, I'll add the necessary code
> > in the Java and C areas ;)
> >
>
>
> There may be a simple way to achieve what David is asking for without
> setting a request timeout or implementing a PING/PONG between mod_jk
> and Tomcat.
>
> What if each worker tracked the number of requests which were handled
> by the worker since the last successful completion of a request?
>
> i.e. add the following to a worker
>
> worker->last_completed // Time in seconds since last successfully completed request
> worker->requests_since_last_completed // Number of requests sent to worker
>                                       // since last successful completion.
>
> Then logic could be added to try and detect an instance of Tomcat which has
> failed. Perhaps even allow several additional worker properties to determine
> when mod_jk should consider the worker failed.

This won't work with the pre-fork MPM, since each Apache child will have
its own idea of the timing. The only way that it could tell that a Tomcat
failed is to try the request and fail :).

>
> The idea needs to be fleshed out some more. But we should be able to track
> enough data about how a worker is performing to make some simple decisions.
>
> Glenn
>
> ----------------------------------------------------------------------
> Glenn Nielsen             [EMAIL PROTECTED] | /* Spelin donut madder    |
> MOREnet System Programming               |  * if iz ina coment.      |
> Missouri Research and Education Network  |  */                       |
> ----------------------------------------------------------------------
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
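
For what it's worth, here is a rough sketch of the per-worker bookkeeping
proposed above, just to make the idea concrete. Only the two field names come
from the proposal. The struct names, the helper functions, and the two limits
(standing in for the "additional worker properties") are made up for
illustration and are not taken from the mod_jk source. I also store a
timestamp in last_completed rather than an elapsed time, and compute the
elapsed time when checking.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-worker health record. The two field names follow the
 * proposal in this thread; nothing here is actual mod_jk code. */
typedef struct {
    time_t   last_completed;                /* when a request last completed OK */
    unsigned requests_since_last_completed; /* requests sent since that time    */
} worker_health;

/* Hypothetical tunables, standing in for "additional worker properties". */
typedef struct {
    unsigned max_pending;   /* requests tolerated without any completion */
    double   max_idle_secs; /* seconds tolerated without any completion  */
} health_limits;

void worker_health_init(worker_health *w, time_t now)
{
    assert(w != NULL);
    w->last_completed = now;
    w->requests_since_last_completed = 0;
}

/* Call when a request is handed to the worker. */
void worker_request_sent(worker_health *w)
{
    assert(w != NULL);
    w->requests_since_last_completed++;
}

/* Call when the worker finishes a request successfully. */
void worker_request_completed(worker_health *w, time_t now)
{
    assert(w != NULL);
    w->last_completed = now;
    w->requests_since_last_completed = 0;
}

/* One possible rule: suspect the worker only when BOTH limits are exceeded,
 * so that a single long-running servlet does not trip the check. */
bool worker_looks_failed(const worker_health *w, const health_limits *lim,
                         time_t now)
{
    assert(w != NULL && lim != NULL);
    if (w->requests_since_last_completed <= lim->max_pending)
        return false;
    return difftime(now, w->last_completed) > lim->max_idle_secs;
}
```

Requiring both limits is only one choice; a count-only or time-only rule would
be simpler but would more easily flag a slow servlet as a dead Tomcat. The
caveat above still applies: under the pre-fork MPM each Apache child would
hold its own copy of this record, so the counters would need to live somewhere
shared between children before the check could see the whole picture.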