We have been trying to troubleshoot an issue in our production environment
in which our website becomes completely unresponsive.  First, our environment:

Windows Server 2003 SP2 (multiple servers; it happens on all of them at
different times)
Java JDK 1.5.0_06
Tomcat 5.5.8
mod_jk 1.2.15
Apache 1.3.34/Mod_SSL

Each server uses AJP/1.3 to connect Apache to Tomcat on 127.0.0.1.

We have experienced two issues.  The first is a 'Server Busy Page'
being returned when we are under any load.  We traced this to what we are
99.99% sure is a misstatement in the workers.properties documentation:

"Do not use cachesize with values higher then 1 on Apache 2.x prefork or
Apache 1.3.x!"

I believe that this applies only to non-Windows installations (correct me
if I am wrong here).  Apache 1.3.34 on Windows is multi-threaded, and you must
set cachesize = webserver child processes.  This change resolved the
'Server Busy Page' issue.  I believe that having only 1 thread
essentially causes it to run in 

The real problem we are having is what appears to be hanging connections
between mod_jk and Tomcat.  The symptom is that the site becomes
unresponsive.  If we look at the 'server status' page in the Tomcat Manager
while this is happening, we find numerous connections in the
jk-127.0.0.1-8009 section that do not seem to be finishing.  The site seems
to stop responding when the number of 'hung' connections reaches maxThreads
as set in server.xml.  Raising maxThreads only lets us run longer before
Tomcat stops responding, because there are more threads available to hang.
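
To illustrate the exhaustion effect described above, here is a minimal
sketch.  It uses a java.util.concurrent fixed thread pool as a stand-in for
Tomcat's own connector thread pool; that substitution, the class name
PoolExhaustionDemo, and the numbers are assumptions for illustration only,
not taken from our setup.  Once every worker is held by a request that never
finishes, even a trivial request just sits in the queue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolExhaustionDemo {

    /**
     * Occupies `hung` workers of a pool of `maxThreads` with tasks that do not
     * finish on their own, then submits one trivial request and reports
     * whether it completed within waitMillis.
     */
    public static boolean quickRequestServed(int maxThreads, int hung, long waitMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        final CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < hung; i++) {
                pool.submit(new Runnable() {
                    public void run() {
                        try {
                            release.await(); // stands in for a call that never returns
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                });
            }
            Future<?> quick = pool.submit(new Runnable() {
                public void run() { /* a page that normally returns at once */ }
            });
            try {
                quick.get(waitMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false; // still queued behind the hung tasks
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        } finally {
            release.countDown();  // let the "hung" tasks go
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("one worker free:  " + quickRequestServed(4, 3, 500));
        System.out.println("no workers free:  " + quickRequestServed(4, 4, 500));
    }
}
```

With one worker left free the quick request is served (true); with all
workers hung it times out (false), which matches what we see on the site.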

Sometimes a single Tomcat thread will hang as indicated below (this is from
one that is currently hung):
Stage: S
Time: 4644016
B Sent: 632 KB
B Received: 0 KB
Client: [client IP]
Vhost: [vhost of website]
Page: GET [page from server]

Notice the time on this.  I am not sure why this thread is not finishing;
this is a page that usually takes about 100ms to return its contents.
Checking the threads through jconsole, I find the following for this thread
(I am pretty sure that this is the correct thread):

Name: TP-Processor23
State: RUNNABLE
Total blocked: 2,219  Total waited: 57,956

Stack trace: 
java.net.SocketOutputStream.socketWrite0(Native Method)
java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
java.net.SocketOutputStream.write(SocketOutputStream.java:136)
org.apache.jk.common.ChannelSocket.send(ChannelSocket.java:506)
org.apache.jk.server.JkCoyoteHandler.doWrite(JkCoyoteHandler.java:260)
org.apache.coyote.Response.doWrite(Response.java:551)
org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:361)
org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:403)
org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:323)
org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:392)
org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:381)
org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:76)
com.seccas.servlet.GetMessagePartServlet.execute(GetMessagePartServlet.java:226)
com.seccas.servlet.GetMessagePartServlet.doGet(GetMessagePartServlet.java:31)
javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:214)
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:306)
org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:385)
org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:745)
org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:675)
org.apache.jk.common.SocketConnection.runIt(ChannelSocket.java:868)
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
java.lang.Thread.run(Thread.java:595)
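
The same check can be scripted rather than clicked through in jconsole.  The
sketch below uses the standard java.lang.management API to list threads whose
top stack frame is a native socket call: socketWrite0 as in the trace above,
or its read-side counterpart socketRead0.  The class name SocketThreadScan is
hypothetical, other JVM versions may show different frame names, and the scan
only sees threads of the JVM it runs in, so it would have to be called from
inside Tomcat (a test JSP, for example) to tell us anything.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class SocketThreadScan {

    /** Returns "write", "read", or null, judged from the top stack frame only. */
    public static String socketState(StackTraceElement[] stack) {
        if (stack == null || stack.length == 0) {
            return null;
        }
        StackTraceElement top = stack[0];
        if ("java.net.SocketOutputStream".equals(top.getClassName())
                && "socketWrite0".equals(top.getMethodName())) {
            return "write";
        }
        if ("java.net.SocketInputStream".equals(top.getClassName())
                && "socketRead0".equals(top.getMethodName())) {
            return "read";
        }
        return null;
    }

    /** Lists threads of the current JVM that are sitting in one of those calls. */
    public static List<String> scan() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo[] infos = mx.getThreadInfo(mx.getAllThreadIds(), Integer.MAX_VALUE);
        List<String> hits = new ArrayList<String>();
        for (int i = 0; i < infos.length; i++) {
            ThreadInfo info = infos[i];
            if (info == null) {
                continue; // thread ended between the two calls
            }
            String state = socketState(info.getStackTrace());
            if (state != null) {
                hits.add(info.getThreadName() + " [" + info.getThreadState()
                        + "] in socket " + state);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> hits = scan();
        for (int i = 0; i < hits.size(); i++) {
            System.out.println(hits.get(i));
        }
    }
}
```

A single snapshot cannot tell a hung thread from one that is merely busy;
a thread would have to show up in the same call across several scans before
I would call it hung.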

Interestingly, if I restart Apache, the hung thread exits and everything
seems to be OK.  This makes me think that this is an issue with mod_jk
talking to Tomcat, perhaps an AJP issue?
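
One way to picture this behaviour: a thread writing to a socket blocks once
the buffers fill up if the other end stops reading, and it is released with
an error as soon as the other end closes.  The toy reproduction below uses
plain loopback sockets, not mod_jk or AJP; the class name BlockedWriteDemo
and the sizes are assumptions, and it does not prove that this is what is
happening on our servers.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class BlockedWriteDemo {

    /**
     * Returns { blockedWhilePeerIdle, freedAfterPeerClosed }.
     * The peer accepts the connection but never reads from it.
     */
    public static boolean[] run() {
        try {
            ServerSocket server = new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"));
            final Socket writerSide = new Socket("127.0.0.1", server.getLocalPort());
            Socket peer = server.accept(); // plays the side that stops reading

            Thread writer = new Thread(new Runnable() {
                public void run() {
                    byte[] chunk = new byte[8192];
                    try {
                        OutputStream out = writerSide.getOutputStream();
                        for (int i = 0; i < 32768; i++) { // up to 256 MB in total
                            out.write(chunk);
                        }
                    } catch (IOException e) {
                        // expected once the peer goes away
                    }
                }
            }, "demo-writer");
            writer.setDaemon(true);
            writer.start();

            writer.join(1000);                 // give it time to fill the buffers
            boolean blocked = writer.isAlive();

            peer.close();                      // comparable to restarting the peer
            server.close();
            writer.join(5000);
            boolean freed = !writer.isAlive();

            writerSide.close();
            return new boolean[] { blocked, freed };
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("writer blocked while peer was idle: " + r[0]);
        System.out.println("writer freed after peer closed:     " + r[1]);
    }
}
```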

I had the same issue on another server (same environment), but it seemed to
block the whole server.  All the processes on this server were blocked by:

Name: TP-Processor20
State: RUNNABLE
Total blocked: 529  Total waited: 6,578

Stack trace: 
java.net.SocketInputStream.socketRead0(Native Method)
java.net.SocketInputStream.read(SocketInputStream.java:129)
java.io.DataInputStream.readFully(DataInputStream.java:176)
java.io.DataInputStream.readFully(DataInputStream.java:152)
net.sourceforge.jtds.jdbc.SharedSocket.readPacket(SharedSocket.java:826)
net.sourceforge.jtds.jdbc.SharedSocket.getNetPacket(SharedSocket.java:707)
net.sourceforge.jtds.jdbc.ResponseStream.getPacket(ResponseStream.java:466)
net.sourceforge.jtds.jdbc.ResponseStream.read(ResponseStream.java:103)
net.sourceforge.jtds.jdbc.ResponseStream.peek(ResponseStream.java:88)
net.sourceforge.jtds.jdbc.TdsCore.wait(TdsCore.java:3870)
net.sourceforge.jtds.jdbc.TdsCore.executeSQL(TdsCore.java:1042)
net.sourceforge.jtds.jdbc.JtdsStatement.executeSQL(JtdsStatement.java:478)
net.sourceforge.jtds.jdbc.JtdsPreparedStatement.execute(JtdsPreparedStatement.java:478)
org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:168)
...<SNIP>

In both cases, restarting Apache resolves the issue.

Any thoughts that anyone has would be greatly appreciated.  I am running out
of things to try on the troubleshooting side.  We are also investigating
whether our code could somehow be contributing to this, for example through
some sort of session synchronization issue.

Thanks,
Justin

--
Justin Greene
SECCAS, LLC.
212-242-9308 x 101 
