Hi Chris,
you are totally right. It was a little late yesterday :(
I updated the two patches. The only change is in jk_ajp_common.c. The
flex call inside ajp_next_connection is replaced by the normal one, so
there is no more change there, and instead the normal call in
ajp_reset_endpoint is now a flex one.
I overlooked the obvious fact that the client error is non-recoverable,
so we don't close the connection in ajp_next_connection, but instead
immediately abort the service. Later, when done is called, the
connection is closed as usual in ajp_reset_endpoint.
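As a rough illustration of the decision described above, here is a minimal sketch in C. It is not the patch itself: the struct, the enum and the function name are hypothetical stand-ins, and the real endpoint structure in mod_jk has many more fields.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for mod_jk's endpoint state. */
typedef struct {
    int  sd;                 /* backend socket descriptor, -1 if closed */
    bool reuse;              /* may the connection go back to the pool? */
    bool client_write_error; /* set when writing to the browser failed */
} sketch_endpoint;

typedef enum {
    SKETCH_NOTHING,      /* no socket to close, or the socket is kept */
    SKETCH_CLOSE_LINGER, /* drain the backend before closing */
    SKETCH_CLOSE_NOW     /* close immediately, skip draining */
} sketch_action;

/* Decide how the backend connection should be handled at reset time. */
sketch_action sketch_reset_action(const sketch_endpoint *ep)
{
    if (ep->sd < 0 || ep->reuse)
        return SKETCH_NOTHING;
    /* Assumption: only a client write error disables draining. */
    return ep->client_write_error ? SKETCH_CLOSE_NOW : SKETCH_CLOSE_LINGER;
}
```

The point of the sketch is only that the choice between draining and closing immediately is made in one place, at reset time, rather than when the next connection is fetched.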
The patch should be safe in the sense that I don't expect problems. It's
a little too early to guarantee inclusion in the regular distribution though.
Let us know, if you deploy it into production, and if it solves your
problem.
Regards,
Rainer
Chris Hut wrote:
Hi Rainer, thanks so much for the reply, this looks like just what we
need!
I tried your patch on 1.2.26 and unfortunately it did not quite work -
still got the 30 second wait. I dug a little deeper with some debugging
code and it looks like, in the case of a client abort, the code goes
through the old jk_shutdown_socket function (which forces linger to be
set to true) instead of the new jk_flex_shutdown_socket.
Here's my log output, my additions preceded with "=====":
[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
ajp_process_callback::jk_ajp_common.c (1606): Writing to client aborted
or client network problems
[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
ajp_service::jk_ajp_common.c (2191): (worker1) sending request to tomcat
failed (unrecoverable), because of client write error (attempt=1)
[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
ajp_reset_endpoint::jk_ajp_common.c (695): ====== non-reusable endpoint,
calling jk_shutdown_socket
[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
jk_shutdown_socket::jk_connect.c (636): ====== in jk_shutdown_socket,
calling jk_flex_shutdown_socket with linger set to TRUE
[Fri Jan 25 19:40:18.516 2008] [7419:1147140448] [info]
jk_flex_shutdown_socket::jk_connect.c (661): ====== linger is: 1
Experimenting some more, I found that if I changed line 695 in
jk_ajp_common.c from:
jk_shutdown_socket(ae->sd, l);
To:
jk_flex_shutdown_socket(ae->sd, ae->linger, l);
Then it works perfectly. But, I hesitate to deploy my own hacks to our
production systems so if you could give this a once-over (and perhaps an
updated patch file, if you have time) that would be fantastic.
Thanks again for the help!
Chris
-----Original Message-----
From: Rainer Jung [mailto:[EMAIL PROTECTED]
Sent: Friday, January 25, 2008 4:37 PM
To: Tomcat Users List
Subject: Re: Need *faster* connection abort with mod_jk
Hi Chris,
perfect, that makes sense! Yes, we are draining the backend connection
before closing it down, and since in your case the backend is happy to
go along and keep streaming, it's not just a question of capturing a
few additional bytes.
Mladen Turk added the connection draining a while ago, so we might need
to discuss the pros and cons. The URL you gave includes a nice
description. In our case there is no pipelining and once we finished
sending the request (including possible request body) to the backend,
closing without draining should be safe.
I made a patch that should disable connection draining exactly when we
get a write error on the client connection. The patch is
available at
http://people.apache.org/~rjung/mod_jk-dev/patches/jk_close_immediate_on_client_wr_error.patch
(for 1.2.27-dev) and
http://people.apache.org/~rjung/mod_jk-dev/patches/jk_close_immediate_on_client_wr_error-1_2_26.patch
(for 1.2.26)
If you want to test it, I would suggest using 1.2.26, because 1.2.27-dev
is not well tested yet and there are quite some changes in it.
Have fun,
Rainer
Chris Hut wrote:
Hi Rainer, thanks for the reply!
interesting use case :)
Just trying to keep things entertaining around here :)
are you writing something back during the wait time, or are you simply
doing processing on the backend?
Yes, the long-running process (video rendering) is also streaming the
video bytes back to the client using outputStream.write(). It's this
write (ultimately, to an org.apache.catalina.connector.OutputBuffer)
that throws the ClientAbortException, if the client is actually gone.
So I figured out that JK is definitely holding open its connection to
Tomcat, meaning Tomcat does not know to abort. In desperation I
searched for the number "30" (hoping for a constant) in the mod_jk
source code, and found this in jk_connect.c:
#ifndef MAX_SECS_TO_LINGER
#define MAX_SECS_TO_LINGER 30
#endif
...
int jk_shutdown_socket(jk_sock_t s)
{
    ...
    do {
        /* Read all data from the peer until we reach "end-of-file"
         * (FIN from peer) or we've exceeded our overall timeout. If the
         * backend does not send us bytes within 2 seconds
         * (a value pulled from Apache 1.3 which seems to work well),
         * close the connection.
         */
        ... /* [reads bytes] */
    } while (difftime(time(NULL), start) < MAX_SECS_TO_LINGER);
}
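To make the elided part easier to picture, here is a minimal, self-contained sketch of such a lingering close using plain POSIX calls. It is not the mod_jk implementation: the function name, the linger flag, the buffer size and the SKETCH_ constants are assumptions for illustration, and only the 30 second cap and the 2 second per-read wait are taken from the excerpt above.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#define SKETCH_MAX_SECS_TO_LINGER 30 /* overall cap, as in the excerpt */
#define SKETCH_SECS_PER_READ       2 /* per-read wait, as in the comment */

/* Close sd. With linger != 0, first send a FIN and then drain whatever
 * the peer still sends until EOF, a quiet period, or the overall cap.
 * Returns the result of close(). */
int sketch_shutdown_socket(int sd, int linger)
{
    char buf[512];
    time_t start = time(NULL);

    if (!linger)
        return close(sd);        /* no draining at all */

    if (shutdown(sd, SHUT_WR) != 0)
        return close(sd);        /* cannot half-close: just close */

    do {
        fd_set rfds;
        struct timeval tv;
        ssize_t n;

        FD_ZERO(&rfds);
        FD_SET(sd, &rfds);
        tv.tv_sec = SKETCH_SECS_PER_READ;
        tv.tv_usec = 0;

        if (select(sd + 1, &rfds, NULL, NULL, &tv) <= 0)
            break;               /* quiet for 2 seconds, or an error */
        n = read(sd, buf, sizeof(buf));
        if (n <= 0)
            break;               /* EOF (FIN from peer) or read error */
    } while (difftime(time(NULL), start) < SKETCH_MAX_SECS_TO_LINGER);

    return close(sd);
}
```

In a loop of this shape, a peer that keeps sending data at least every 2 seconds never triggers the quiet-period exit, so the close only happens once the overall cap is reached. That would match a 30 second delay when the backend is streaming continuously.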
Sockets and low-level network programming aren't my strong suit, but
from searching around it sounds like this (lingering) is a common
practice to ensure proper TCP communication - I found a bit more info
here: http://httpd.apache.org/docs/2.0/misc/fin_wait_2.html#appendix
Tomcat's Http11Connector has a connectionLinger attribute (which
translates internally to a soLinger) which sounds like it does the
same thing - except that it's disabled by default.
So, does anybody know if there would be any detrimental impact to
re-compiling mod_jk with MAX_SECS_TO_LINGER set lower, say, 10 seconds
or 5? Or even lower?
Thanks again for the help!
Chris
-----Original Message-----
From: Rainer Jung [mailto:[EMAIL PROTECTED]
Sent: Friday, January 25, 2008 7:03 AM
To: Tomcat Users List
Subject: Re: Need *faster* connection abort with mod_jk
Hi Chris,
interesting use case :)
mod_jk closes the backend connection as soon as the reply_timeout
fires, or there is something to write back to the client and mod_jk
detects, that the connection to the client can not be used any longer
(browser stop, retry or click on another link).
If the user ends waiting for the reply and you don't try to write
something back, mod_jk won't detect that, because it sits there and
waits for something to come back from the backend. So to reliably
detect a browser stop, you need to actively use the connection. From
your comments about the mod_jk log file, it seems that you are
actually doing this.
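The general point, that a closed peer is typically noticed only when you try to write, can be shown with a small POSIX sketch. This is generic socket code, not mod_jk code, and the function name is made up for illustration.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to write to the peer. Returns 1 if the write shows the peer is
 * gone (EPIPE or ECONNRESET), 0 if the write was accepted, and -1 on
 * any other error. */
int sketch_client_gone_on_write(int sd, const void *buf, size_t len)
{
    ssize_t n;

    /* Ignore SIGPIPE process-wide so send() reports EPIPE instead of
     * the process being terminated by the signal. */
    signal(SIGPIPE, SIG_IGN);

    n = send(sd, buf, len, 0);
    if (n >= 0)
        return 0;
    return (errno == EPIPE || errno == ECONNRESET) ? 1 : -1;
}
```

Note that over TCP the first write after the peer has closed may still be accepted locally, with the error only showing up on a later write, so detection can lag behind the actual close by one write.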
Why doesn't Tomcat immediately throw the exception: I guess (wild
guess), that it also only notices the closed mod_jk connection, if it
tries to use it. If you are actually using it continuously we would
have to investigate, why there is such a delay.
So first question to get closer would be: are you writing something
back during the wait time, or are you simply doing processing on the
backend?
In this case (because I don't understand the client abort detection of
mod_jk then): Can you reproduce the behaviour for a single request on
a test system using JkLogLevel debug?
Please make sure, that the clocks on the mod_jk system and on the
Tomcat system are in sync.
Regards,
Rainer
Chris Hut wrote:
Hi all,
We're using Apache 2.0.61 with mod_jk 1.2.25 and Tomcat 6.0.14.
We have a simple (non-load-balanced) apache/tomcat configuration using
a single worker to forward requests from apache to tomcat.
(workers.properties is below)
Our problem is: Some client requests kick off an expensive,
long-running server-side process. Often, the client will give up
(e.g. the user will navigate to a different browser page) before
completion, and we want to cancel the server-side process early if
possible.
We use the ClientAbortException to easily set an "interrupted" flag
which our process monitors to see if it should abort. When connecting
straight to the servlet using Tomcat only, this is very simple as the
exception is thrown immediately and the process dies right away. This
is what we hope for.
When connecting via apache/mod_jk, though, it takes 30 seconds for the
exception to be thrown in Tomcat. For efficiency we'd love the abort
to happen immediately if possible.
In the mod_jk.log file, we see this as soon as the client aborts (e.g.
closes browser):
[Thu Jan 24 20:09:35.535 2008] [2011:1094711648] [info]
ajp_process_callback::jk_ajp_common.c (1511): Writing to client
aborted or client network problems
[Thu Jan 24 20:09:35.535 2008] [2011:1094711648] [info]
ajp_service::jk_ajp_common.c (1996): (worker1) request failed,
because of client write error without recovery in send loop attempt=0
But it takes 30 seconds to see:
[Thu Jan 24 20:10:05.641 2008] [2011:1094711648] [info]
jk_handler::mod_jk.c (2270): Aborting connection for worker=worker1
Which corresponds exactly to the time when the ClientAbortException is
thrown in Tomcat.
Given the exact nature of the timing involved (30 seconds) I'm
guessing/hoping this is an apache and/or JK timeout setting; however,
I can't find a property that would do what we require which is just to
kill the Tomcat connection faster if the end-user client closes the
connection on their side.
Can anybody point me to a setting to tweak? I did try using
recovery_options=4 (which says, "close the connection to Tomcat, if we
detect an error when writing back the answer to the client (browser)")
but the behavior is unchanged. I feel like changing the worker
timeouts is the wrong direction because the JK-to-Tomcat communication
is working just fine, we just need a way to propagate JK's
client-abort error to Tomcat faster!
Thanks for your help!
workers.properties:
worker.list=worker1
worker.worker1.type=ajp13
worker.worker1.host=localhost
worker.worker1.port=8009
#worker.worker1.retries=4
Chris