Hi Chris,

you are totally right. It was a little late yesterday :(

I updated the two patches. The only change is in jk_ajp_common.c. The flex call inside ajp_next_connection is replaced by the normal one, so there is no longer any change there; instead, the normal call in ajp_reset_endpoint is now a flex one.

I overlooked the obvious fact that the client error is non-recoverable, so we don't close the connection in ajp_next_connection, but instead immediately abort the service until done is called later and closes the connection as usual in ajp_reset_endpoint.
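
For readers following the thread, the control flow described here can be sketched roughly as below. This is not the actual mod_jk code: the struct, the `sketch_*` names and the recording stub are invented for illustration. Only the idea comes from the discussion, namely a per-endpoint linger flag that is cleared on a client write error and consulted when the endpoint is reset.

```c
/* Hypothetical stand-in for an AJP endpoint; not the real mod_jk type. */
struct sketch_endpoint {
    int sd;      /* backend socket descriptor, -1 when closed */
    int reuse;   /* non-zero if the connection may be reused */
    int linger;  /* non-zero: drain the backend before closing */
};

/* Records how the last close was performed, so the behaviour is observable. */
static int sketch_last_close_lingered = -1;

/* Stub for a flexible shutdown: a real one would drain the socket
 * when linger is set and then close it. */
static void sketch_flex_shutdown(int sd, int linger)
{
    (void)sd;
    sketch_last_close_lingered = linger;
}

/* Called when writing to the client fails. The error is unrecoverable,
 * so only mark the endpoint here and let the caller abort the service. */
static void sketch_on_client_write_error(struct sketch_endpoint *ep)
{
    ep->reuse = 0;
    ep->linger = 0;  /* nobody is left to receive the response */
}

/* Called from "done": closes non-reusable endpoints, honouring the flag. */
static void sketch_reset_endpoint(struct sketch_endpoint *ep)
{
    if (!ep->reuse && ep->sd >= 0) {
        sketch_flex_shutdown(ep->sd, ep->linger);
        ep->sd = -1;
    }
}
```

The point of the sketch is that the close happens in exactly one place (the reset), and the error path only records that draining is pointless.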

The patch should be safe in the sense that I don't expect problems. It's a little too early to guarantee inclusion in the regular distribution though.

Let us know if you deploy it into production, and whether it solves your problem.

Regards,

Rainer

Chris Hut schrieb:
Hi Rainer, thanks so much for the reply, this looks like just what we
need!

I tried your patch on 1.2.26 and unfortunately it did not quite work -
still got the 30 second wait.  I dug a little deeper with some debugging
code and it looks like, in the case of a client abort, the code goes
through the old jk_shutdown_socket function (which forces linger to be
set to true) instead of the new jk_flex_shutdown_socket.

Here's my log output, my additions preceded with "=====":

[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
ajp_process_callback::jk_ajp_common.c (1606): Writing to client aborted
or client network problems
[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
ajp_service::jk_ajp_common.c (2191): (worker1) sending request to tomcat
failed (unrecoverable), because of client write error (attempt=1)
[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
ajp_reset_endpoint::jk_ajp_common.c (695): ====== non-reusable endpoint,
calling jk_shutdown_socket
[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
jk_shutdown_socket::jk_connect.c (636): ====== in jk_shutdown_socket,
calling jk_flex_shutdown_socket with linger set to TRUE
[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
jk_flex_shutdown_socket::jk_connect.c (661): ====== linger is: 1

Experimenting some more, I found that if I changed line 695 in
jk_ajp_common.c from:

jk_shutdown_socket(ae->sd, l);

To:

jk_flex_shutdown_socket(ae->sd, ae->linger, l);

Then it works perfectly.  But, I hesitate to deploy my own hacks to our
production systems so if you could give this a once-over (and perhaps an
updated patch file, if you have time) that would be fantastic.

Thanks again for the help!

Chris

-----Original Message-----
From: Rainer Jung [mailto:[EMAIL PROTECTED]
Sent: Friday, January 25, 2008 4:37 PM
To: Tomcat Users List
Subject: Re: Need *faster* connection abort with mod_jk

Hi Chris,

perfect, that makes sense! Yes, we are draining the backend connection
before closing it down, and since in your case the backend is happy to
go along and proceed streaming it's not just a question of capturing a
few additional bytes.

Mladen Turk added the connection draining a while ago, so we might need
to discuss the pros and cons. The URL you gave includes a nice
description. In our case there is no pipelining and once we finished
sending the request (including possible request body) to the backend,
closing without draining should be safe.

I made a patch, that should disable connection draining exactly if we
got a write error when using the client connection. The patch is
available at

http://people.apache.org/~rjung/mod_jk-dev/patches/jk_close_immediate_on_client_wr_error.patch

(for 1.2.27-dev) and

http://people.apache.org/~rjung/mod_jk-dev/patches/jk_close_immediate_on_client_wr_error-1_2_26.patch

(for 1.2.26)

If you want to test it, I would suggest using 1.2.26, because 1.2.27-dev
is not well tested yet and there are quite some changes in it.

Have fun,

Rainer

Chris Hut schrieb:
Hi Rainer, thanks for the reply!

interesting use case :)
Just trying to keep things entertaining around here :)

are you writing something back during the wait time, or are you simply
doing processing on the backend?

Yes, the long-running process (video rendering) is also streaming the video bytes back to the client using outputStream.write(). It's this write (ultimately, to an org.apache.catalina.connector.OutputBuffer)
that throws the ClientAbortException, if the client is actually gone.

So I figured out that JK is definitely holding open its connection to Tomcat, meaning Tomcat does not know to abort. In desperation I searched for the number "30" (hoping for a constant) in the mod_jk source code, and found this in jk_connect.c:

#ifndef MAX_SECS_TO_LINGER
#define MAX_SECS_TO_LINGER 30
#endif
...
int jk_shutdown_socket(jk_sock_t s)
{
    ...
    do {
        /* Read all data from the peer until we reach "end-of-file"
         * (FIN from peer) or we've exceeded our overall timeout. If the
         * backend does not send us bytes within 2 seconds
         * (a value pulled from Apache 1.3 which seems to work well),
         * close the connection.
         */
        ... /* [reads bytes] */
    } while (difftime(time(NULL), start) < MAX_SECS_TO_LINGER);
}

Sockets and low-level network programming aren't my strong suit, but from searching around it sounds like this (lingering) is a common practice to ensure proper TCP communication - I found a bit more info
here: http://httpd.apache.org/docs/2.0/misc/fin_wait_2.html#appendix
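
To make the lingering idea concrete, here is a minimal sketch of a drain-then-close helper with a switch to skip the draining. It is not the mod_jk implementation: the function name and the `max_secs` and `idle_secs` parameters are assumptions for illustration; only the overall shape (half-close, read until end-of-file or timeout, then close) follows the comment quoted above.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper, not the mod_jk function: half-close the socket,
 * optionally drain what the peer still sends, then close it.
 * max_secs bounds the total drain time, idle_secs the wait per read.
 * Returns the number of bytes discarded, or -1 if close() failed. */
static long sketch_lingering_close(int sd, int linger, int max_secs, int idle_secs)
{
    char buf[512];
    long drained = 0;
    time_t start = time(NULL);

    /* Tell the peer we will not send anything more (a FIN on TCP). */
    shutdown(sd, SHUT_WR);

    while (linger && difftime(time(NULL), start) < max_secs) {
        fd_set rfds;
        struct timeval tv;
        ssize_t n;

        FD_ZERO(&rfds);
        FD_SET(sd, &rfds);
        tv.tv_sec = idle_secs;
        tv.tv_usec = 0;

        /* Stop if the peer stays silent for idle_secs or select fails. */
        if (select(sd + 1, &rfds, NULL, NULL, &tv) <= 0)
            break;

        n = read(sd, buf, sizeof(buf));
        if (n <= 0)  /* 0 = end-of-file (peer closed), <0 = error */
            break;
        drained += n;
    }
    return close(sd) == 0 ? drained : -1;
}
```

In this sketch, a peer that keeps sending data never triggers the idle timeout, so the loop only ends once `max_secs` has passed. Calling it with `linger` set to 0 closes right after the half-close.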

Tomcat's Http11Connector has a connectionLinger attribute (which translates internally to a soLinger) which sounds like it does the same thing - except that it's disabled by default.

So, does anybody know if there would be any detrimental impact to re-compiling mod_jk with MAX_SECS_TO_LINGER set lower, say, 10 seconds or 5?  Or even lower?

Thanks again for the help!

Chris


-----Original Message-----
From: Rainer Jung [mailto:[EMAIL PROTECTED]
Sent: Friday, January 25, 2008 7:03 AM
To: Tomcat Users List
Subject: Re: Need *faster* connection abort with mod_jk

Hi Chris,

interesting use case :)

mod_jk closes the backend connection as soon as the reply_timeout fires, or there is something to write back to the client and mod_jk detects, that the connection to the client can not be used any longer (browser stop, retry or click on another link).

If the user stops waiting for the reply and you don't try to write something back, mod_jk won't detect that, because it sits there and waits for something to come back from the backend. So to reliably detect a browser stop, you need to actively use the connection. From your comments about the mod_jk log file, it seems that you are actually doing this.

Why doesn't Tomcat immediately throw the exception: I guess (wild guess), that it also only notices the closed mod_jk connection, if it tries to use it. If you are actually using it continuously we would have to investigate, why there is such a delay.

So the first question to get closer would be: are you writing something back during the wait time, or are you simply doing processing on the backend? In this case (because I don't understand the client abort detection of mod_jk then): Can you reproduce the behaviour for a single request on a test system using JkLogLevel debug?

Please make sure, that the clocks on the mod_jk system and on the Tomcat system are in sync.

Regards,

Rainer

Chris Hut wrote:
Hi all,

We're using Apache 2.0.61 with mod_jk 1.2.25 and Tomcat 6.0.14.

We have a simple (non-load-balanced) apache/tomcat configuration using
a single worker to forward requests from apache to tomcat.
(workers.properties is below)

Our problem is: Some client requests kick off an expensive, long-running server-side process. Often, the client will give up (e.g. the user will navigate to a different browser page) before completion, and we want to cancel the server-side process early if possible. We use the ClientAbortException to easily set an "interrupted" flag which our process monitors to see if it should abort. When connecting straight to the servlet using Tomcat only, this is very simple as the exception is thrown immediately and the process dies right away. This is what we hope for.

When connecting via apache/mod_jk, though, it takes 30 seconds for the exception to be thrown in Tomcat. For efficiency we'd love the abort to happen immediately if possible.

In the mod_jk.log file, we see this as soon as the client aborts (e.g. closes browser):

[Thu Jan 24 20:09:35.535 2008] [2011:1094711648] [info] ajp_process_callback::jk_ajp_common.c (1511): Writing to client aborted or client network problems
[Thu Jan 24 20:09:35.535 2008] [2011:1094711648] [info] ajp_service::jk_ajp_common.c (1996): (worker1) request failed, because of client write error without recovery in send loop attempt=0

But it takes 30 seconds to see:

[Thu Jan 24 20:10:05.641 2008] [2011:1094711648] [info] jk_handler::mod_jk.c (2270): Aborting connection for worker=worker1

Which corresponds exactly to the time when the ClientAbortException is
thrown in Tomcat.

Given the exact nature of the timing involved (30 seconds) I'm guessing/hoping this is an apache and/or JK timeout setting; however, I can't find a property that would do what we require, which is just to kill the Tomcat connection faster if the end-user client closes the connection on their side.

Can anybody point me to a setting to tweak?  I did try using
recovery_options=4 (which says, "close the connection to Tomcat, if we detect an error when writing back the answer to the client (browser)") but the behavior is unchanged. I feel like changing the worker timeouts is the wrong direction because the JK-to-Tomcat communication is working just fine, we just need a way to propagate JK's client-abort error to Tomcat faster!

Thanks for your help!

workers.properties:

worker.list=worker1
worker.worker1.type=ajp13
worker.worker1.host=localhost
worker.worker1.port=8009
#worker.worker1.retries=4

Chris

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
