Re: mod_jk reply_timeout and error state

Rainer Jung Wed, 21 Apr 2010 03:27:55 -0700

Hi Sean,

On 20.04.2010 08:04, Sean GAO wrote:

According to online documentation
(http://tomcat.apache.org/connectors-doc/generic_howto/timeouts.html):
----
Long Garbage Collection pauses on the backend do not make a good fit
with some timeouts. Try to optimise your Java memory and GC settings.
----


So if JVM tuning doesn't help, what else could I do? Balancer worker's
method=Busyness setting may have some effect, but still this is
different. What do you think?

There is no kind of soft reply timeout which would mean a long response time indicates we shouldn't send more requests to that Tomcat but should still wait for the outstanding responses.

The best you can do is tweak the timeouts and the GC. Modern CMS GC mostly avoids stop-the-world pauses; most of its work runs concurrently. Yes, after some time you might run into an occasional stop-the-world collection because of fragmentation, but those will be much rarer than without CMS.

If your GC stop times are about 30 seconds, then that is not good, but I wouldn't reduce the reply_timeout to something much smaller anyway. You don't want to make the error detection very sensitive, because then it is not unlikely that you end up making your system more unstable than without it. You want to detect serious problems and react to them, but you shouldn't want to react quickly to every indication of a possible problem.

What might help you a bit is the ability to define reply_timeout depending on the URL of the request. So if you know there are e.g. some reporting URLs that will take longer than a minute, you could set a general reply_timeout to e.g. 30 seconds, and the timeout for the report URLs to e.g. 2 minutes.
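
As a rough sketch, assuming you map requests through a uriworkermap.properties file (loaded via JkMountFile) and use the rule extension syntax available since 1.2.27, it could look like the following. The /myapp paths are made up for illustration, the worker name is the one from your config, and the value is in milliseconds:

```
# General mapping: the 30 second reply_timeout still comes from workers.properties
/myapp/*=balancer
# Hypothetical reporting URLs: override the reply_timeout with 2 minutes
/myapp/reports/*=balancer;reply_timeout=120000
```

Please check the uriworkermap.properties reference for the exact behaviour before relying on this.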


If you use reply_timeout, never forget to also add a max_reply_timeouts.
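
A minimal sketch of that, assuming the attribute is set on your load balancer worker; the value 10 is only a placeholder, not a recommendation:

```
# Tolerate some reply timeouts before a member is put into error state
# (placeholder value, tune it to your own traffic)
worker.balancer.max_reply_timeouts=10
```

The idea is that a few slow requests still get cut off after reply_timeout, but they alone don't take the whole Tomcat out of the balancer. The exact counting rules are described in the workers.properties reference.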

Concerning your configuration below, please do also consult

http://tomcat.apache.org/connectors-doc/generic_howto/timeouts.html

Regards,

Rainer

On Tue, Apr 20, 2010 at 12:48 PM, Sean GAO<gaoyuxi...@gmail.com>  wrote:

Hi,

We are running apache 2.2.4 and tomcat 5.5.28 with mod_jk 1.2.28. 3
tomcat instances.

Referring to http://tomcat.apache.org/connectors-doc/reference/workers.html
, we came up with a workers.properties file like this:

worker.list=balancer
worker.maintain=30
#tomcat01
worker.tomcat01.port=18009
worker.tomcat01.host=localhost
worker.tomcat01.type=ajp13
worker.tomcat01.lbfactor=120
worker.tomcat01.retries=2
worker.tomcat01.socket_timeout=30
worker.tomcat01.reply_timeout=30000
worker.tomcat01.recover_time=300
#tomcat02
worker.tomcat02.port=28009
worker.tomcat02.host=localhost
worker.tomcat02.type=ajp13
worker.tomcat02.lbfactor=100
worker.tomcat02.retries=2
worker.tomcat02.socket_timeout=30
worker.tomcat02.reply_timeout=30000
worker.tomcat02.recover_time=300
#tomcat03
worker.tomcat03.port=38009
worker.tomcat03.host=localhost
worker.tomcat03.type=ajp13
worker.tomcat03.lbfactor=0
worker.tomcat03.retries=2
#loadbalancer
worker.retries=2
worker.balancer.type=lb
worker.balancer.sticky_session=False
worker.balancer.method=Busyness
worker.balancer.balance_workers=tomcat01,tomcat02,tomcat03

So basically tomcat01 and tomcat02 are the main request handlers, with
tomcat03 acting as a backup server which is accessed only when both
tomcat01 and tomcat02 are in error state (30 seconds without a response,
which does not necessarily mean offline). If something bad happens, e.g.
an excessively long GC or a redeployment, we assume each failed tomcat
instance will get back to business in about 5 minutes.

This meets our needs to a certain degree. However, there's one thing
that bugs me:
If we set the reply_timeout too high, we miss the whole point of
fail-over. If we set the value too low, it's likely we are going to
kill a lot of legitimate requests that would otherwise succeed, which
is not what we want either.

Instead of breaking the long request (say, >30 seconds) and putting the
worker into "error" state, is there any way, any way at all, we can tell
mod_jk to mark a worker "busy", so that future requests are routed to
alternative workers? mod_jk can still check every 30 (or the default
60) seconds whether it is able to resume one of the "busy"-marked
workers, just like it does with the ones in "error" state.


Regards,
Sean


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org


