Re: Need faster connection abort with mod_jk

Rainer Jung Sat, 26 Jan 2008 05:19:41 -0800

Hi Chris,

you are totally right. It was a little late yesterday :(

I updated the two patches. The only change is in jk_ajp_common.c. The flex call inside ajp_next_connection is replaced by the normal one, so there is no longer any change there; instead, the normal call in ajp_reset_endpoint is now a flex one.

I overlooked the obvious fact that the client error is non-recoverable, so we don't close the connection in ajp_next_connection, but instead immediately abort the service; later, done is called and closes the connection in the regular way in ajp_reset_endpoint.

The patch should be safe in the sense that I don't expect problems. It's a little too early to guarantee inclusion in the regular distribution, though.

Let us know if you deploy it into production, and if it solves your problem.


Regards,

Rainer

Chris Hut wrote:

Hi Rainer, thanks so much for the reply, this looks like just what we
need!

I tried your patch on 1.2.26 and unfortunately it did not quite work -
still got the 30 second wait.  I dug a little deeper with some debugging
code and it looks like, in the case of a client abort, the code goes
through the old jk_shutdown_socket function (which forces linger to be
set to true) instead of the new jk_flex_shutdown_socket.

Here's my log output, my additions preceded with "=====":

[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
ajp_process_callback::jk_ajp_common.c (1606): Writing to client aborted
or client network problems
[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
ajp_service::jk_ajp_common.c (2191): (worker1) sending request to tomcat
failed (unrecoverable), because of client write error (attempt=1)
[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
ajp_reset_endpoint::jk_ajp_common.c (695): ====== non-reusable endpoint,
calling jk_shutdown_socket
[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
jk_shutdown_socket::jk_connect.c (636): ====== in jk_shutdown_socket,
calling jk_flex_shutdown_socket with linger set to TRUE
[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
jk_flex_shutdown_socket::jk_connect.c (661): ====== linger is: 1

Experimenting some more, I found that if I changed line 695 in
jk_ajp_common.c from:

jk_shutdown_socket(ae->sd, l);

To:

jk_flex_shutdown_socket(ae->sd, ae->linger, l);

Then it works perfectly.  But, I hesitate to deploy my own hacks to our
production systems so if you could give this a once-over (and perhaps an
updated patch file, if you have time) that would be fantastic.

Thanks again for the help!

Chris

-----Original Message-----
From: Rainer Jung [mailto:[EMAIL PROTECTED]
Sent: Friday, January 25, 2008 4:37 PM
To: Tomcat Users List
Subject: Re: Need *faster* connection abort with mod_jk

Hi Chris,

perfect, that makes sense! Yes, we are draining the backend connection
before closing it down, and since in your case the backend is happy to
go along and proceed streaming it's not just a question of capturing a
few additional bytes.

Mladen Turk added the connection draining a while ago, so we might need
to discuss the pros and cons. The URL you gave includes a nice
description. In our case there is no pipelining and once we finished
sending the request (including possible request body) to the backend,
closing without draining should be safe.
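
The difference between a draining close and an immediate close can be sketched in plain POSIX C. This is only an illustration under stated assumptions, not the mod_jk code or the patch: the function name sketch_flex_shutdown and the two SKETCH_ constants are made up here, with the 30 second and 2 second values taken from the jk_connect.c excerpt quoted further down in this thread.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical names for illustration; not part of the mod_jk API. */
#define SKETCH_MAX_SECS_TO_LINGER 30  /* overall limit for draining */
#define SKETCH_SECS_TO_WAIT        2  /* quiet period that ends draining */

/* Close sd. If linger is non-zero, first send FIN and read whatever the
 * peer still sends, until EOF, a quiet period, or the overall limit.
 * Returns the number of bytes drained, or -1 if close() fails. */
long sketch_flex_shutdown(int sd, int linger)
{
    char buf[512];
    long drained = 0;
    time_t start = time(NULL);

    assert(sd >= 0);
    if (linger) {
        /* Tell the peer we are done writing; the result is ignored here. */
        (void)shutdown(sd, SHUT_WR);
        do {
            fd_set rs;
            struct timeval tv;
            ssize_t n;

            FD_ZERO(&rs);
            FD_SET(sd, &rs);
            tv.tv_sec = SKETCH_SECS_TO_WAIT;
            tv.tv_usec = 0;
            if (select(sd + 1, &rs, NULL, NULL, &tv) <= 0)
                break;          /* peer went quiet, or select failed */
            n = read(sd, buf, sizeof(buf));
            if (n <= 0)
                break;          /* EOF (FIN from peer) or read error */
            drained += n;
        } while (difftime(time(NULL), start) < SKETCH_MAX_SECS_TO_LINGER);
    }
    return close(sd) == 0 ? drained : -1;
}
```

With linger set to 0 the socket is closed right away, which is the behaviour wanted after a client write error. With linger non-zero, a backend that keeps streaming can hold up the close until the overall limit is reached.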

I made a patch, that should disable connection draining exactly if we
got a write error when using the client connection. The patch is
available at

http://people.apache.org/~rjung/mod_jk-dev/patches/jk_close_immediate_on_client_wr_error.patch

(for 1.2.27-dev) and

http://people.apache.org/~rjung/mod_jk-dev/patches/jk_close_immediate_on_client_wr_error-1_2_26.patch

(for 1.2.26)

If you want to test it, I would suggest using 1.2.26, because 1.2.27-dev
is not well tested yet and there are quite some changes in it.

Have fun,

Rainer

Chris Hut wrote:
Hi Rainer, thanks for the reply!

> interesting use case :)

Just trying to keep things entertaining around here :)

> are you writing something back during the wait time, or are you simply
> doing processing on the backend?

Yes, the long-running process (video rendering) is also streaming the
video bytes back to the client using outputStream.write(). It's this
write (ultimately, to an org.apache.catalina.connector.OutputBuffer)
that throws the ClientAbortException, if the client is actually gone.

So I figured out that JK is definitely holding open its connection to
Tomcat, meaning Tomcat does not know to abort. In desperation I
searched for the number "30" (hoping for a constant) in the mod_jk
source code, and found this in jk_connect.c:

#ifndef MAX_SECS_TO_LINGER
#define MAX_SECS_TO_LINGER 30
#endif
...
int jk_shutdown_socket(jk_sock_t s)
{
    ...
    do {
        /* Read all data from the peer until we reach "end-of-file"
         * (FIN from peer) or we've exceeded our overall timeout. If the
         * backend does not send us bytes within 2 seconds
         * (a value pulled from Apache 1.3 which seems to work well),
         * close the connection.
         */
        ... /* [reads bytes] */
    } while (difftime(time(NULL), start) < MAX_SECS_TO_LINGER);
}

Sockets and low-level network programming aren't my strong suit, but
from searching around it sounds like this (lingering) is a common
practice to ensure proper TCP communication - I found a bit more info
here: http://httpd.apache.org/docs/2.0/misc/fin_wait_2.html#appendix

Tomcat's Http11Connector has a connectionLinger attribute (which
translates internally to a soLinger) which sounds like it does the
same thing - except that it's disabled by default.

So, does anybody know if there would be any detrimental impact to
re-compiling mod_jk with MAX_SECS_TO_LINGER set lower, say, 10 seconds
or 5?  Or even lower?

Thanks again for the help!

Chris


-----Original Message-----
From: Rainer Jung [mailto:[EMAIL PROTECTED]
Sent: Friday, January 25, 2008 7:03 AM
To: Tomcat Users List
Subject: Re: Need *faster* connection abort with mod_jk

Hi Chris,

interesting use case :)

mod_jk closes the backend connection as soon as the reply_timeout
fires, or there is something to write back to the client and mod_jk
detects that the connection to the client can not be used any longer
(browser stop, retry or click on another link).

If the user stops waiting for the reply and you don't try to write
something back, mod_jk won't detect that, because it sits there and
waits for something to come back from the backend. So to reliably
detect a browser stop, you need to actively use the connection. From
your comments about the mod_jk log file, it seems that you are
actually doing this.
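
That a vanished client only shows up when something is written to it can be illustrated with a small POSIX C sketch. It is not how mod_jk is implemented; the function name sketch_write_detects_abort is hypothetical, and ignoring SIGPIPE for the whole process is a simplification. On a real TCP connection the first write after the peer disappears may still succeed, so detection can take more than one write.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper for illustration; not part of mod_jk.
 * Tries to send len bytes on sd.
 * Returns 0 if the send was accepted, 1 if it shows that the peer is
 * gone (EPIPE or ECONNRESET), and -1 on any other error. */
int sketch_write_detects_abort(int sd, const void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    /* Simplification: ignore SIGPIPE process-wide so that a failed
     * send reports EPIPE instead of terminating the process. */
    signal(SIGPIPE, SIG_IGN);

    n = send(sd, buf, len, 0);
    if (n >= 0)
        return 0;
    return (errno == EPIPE || errno == ECONNRESET) ? 1 : -1;
}
```

As long as nothing is sent, the function is never called and the closed peer goes unnoticed, which matches the behaviour described above.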

Why doesn't Tomcat immediately throw the exception: I guess (wild
guess), that it also only notices the closed mod_jk connection, if it
tries to use it. If you are actually using it continuously we would
have to investigate, why there is such a delay.

So first question to get closer would be: are you writing something
back during the wait time, or are you simply doing processing on the
backend?

In this case (because I don't understand the client abort detection of
mod_jk then): Can you reproduce the behaviour for a single request on
a test system using JkLogLevel debug?

Please make sure that the clocks on the mod_jk system and on the
Tomcat system are in sync.

Regards,

Rainer

Chris Hut wrote:
Hi all,

We're using Apache 2.0.61 with mod_jk 1.2.25 and Tomcat 6.0.14.
We have a simple (non-load-balanced) apache/tomcat configuration using
a single worker to forward requests from apache to tomcat.
(workers.properties is below)

Our problem is: Some client requests kick off an expensive,
long-running server-side process. Often, the client will give up
(e.g. the user will navigate to a different browser page) before
completion, and we want to cancel the server-side process early if
possible.

We use the ClientAbortException to easily set an "interrupted" flag
which our process monitors to see if it should abort. When connecting
straight to the servlet using Tomcat only, this is very simple as the
exception is thrown immediately and the process dies right away. This
is what we hope for.

When connecting via apache/mod_jk, though, it takes 30 seconds for the
exception to be thrown in Tomcat. For efficiency we'd love the abort
to happen immediately if possible.

In the mod_jk.log file, we see this as soon as the client aborts (e.g.
closes browser):

[Thu Jan 24 20:09:35.535 2008] [2011:1094711648] [info] ajp_process_callback::jk_ajp_common.c (1511): Writing to client aborted or client network problems
[Thu Jan 24 20:09:35.535 2008] [2011:1094711648] [info] ajp_service::jk_ajp_common.c (1996): (worker1) request failed, because of client write error without recovery in send loop attempt=0

But it takes 30 seconds to see:

[Thu Jan 24 20:10:05.641 2008] [2011:1094711648] [info] jk_handler::mod_jk.c (2270): Aborting connection for worker=worker1

Which corresponds exactly to the time when the ClientAbortException is
thrown in Tomcat.

Given the exact nature of the timing involved (30 seconds) I'm
guessing/hoping this is an apache and/or JK timeout setting; however,
I can't find a property that would do what we require, which is just to
kill the Tomcat connection faster if the end-user client closes the
connection on their side.

Can anybody point me to a setting to tweak?  I did try using
recovery_options=4 (which says, "close the connection to Tomcat, if we
detect an error when writing back the answer to the client (browser)")
but the behavior is unchanged. I feel like changing the worker
timeouts is the wrong direction because the JK-to-Tomcat communication
is working just fine, we just need a way to propagate JK's
client-abort error to Tomcat faster!

Thanks for your help!

workers.properties:

worker.list=worker1
worker.worker1.type=ajp13
worker.worker1.host=localhost
worker.worker1.port=8009
#worker.worker1.retries=4

Chris


---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
