On Fri Mar 21 15:40:32 CET 2008 Tomcat Users List <users@tomcat.apache.org> wrote:
Ronald Klop wrote:
> On Fri Mar 21 14:46:03 CET 2008 Tomcat Users List <users@tomcat.apache.org> wrote:
>> Hello Ronald,
>>
>> Ronald Klop wrote:
>> > Hello,
>> >
>> > I have this on one of my cluster nodes. Is this normal?
>> > We are running Tomcat 5.5.26 on linux 2.6.22.1. On all nodes netstat
>> > gives full send and receive buffers on the tcp connections of the
>> > replication.
>> >
>> > "Cluster-FastAsyncSocketSender-6" daemon prio=1 tid=0x08d16af0
>> > nid=0x78d7 in Object.wait() [0x766fe000..0x766ff140]
>> > at java.lang.Object.wait(Native Method)
>> > - waiting on <0x83e2a1b0> (a org.apache.catalina.cluster.util.SingleRemoveSynchronizedAddLock)
>> > at org.apache.catalina.cluster.util.SingleRemoveSynchronizedAddLock.lockRemove(SingleRemoveSynchronizedAddLock.java:205)
>> > - locked <0x83e2a1b0> (a org.apache.catalina.cluster.util.SingleRemoveSynchronizedAddLock)
>> > at org.apache.catalina.cluster.util.FastQueue.remove(FastQueue.java:552)
>> > at org.apache.catalina.cluster.tcp.FastAsyncSocketSender$FastQueueThread.getQueuedMessage(FastAsyncSocketSender.java:506)
>> > at org.apache.catalina.cluster.tcp.FastAsyncSocketSender$FastQueueThread.run(FastAsyncSocketSender.java:485)
>> >
>> > netstat -n | grep 8015
>> > tcp 78829 46336 10.0.10.91:8015 10.0.10.94:37980 ESTABLISHED
>> > tcp 36957 95568 10.0.10.91:53867 10.0.10.87:8015 ESTABLISHED
>> > tcp 79063 46336 10.0.10.91:8015 10.0.10.88:55031 ESTABLISHED
>> > tcp 36912 95568 10.0.10.91:34266 10.0.10.88:8015 ESTABLISHED
>> > tcp 33282 0 10.0.10.91:34803 10.0.10.95:8015 ESTABLISHED
>> > tcp 78290 46336 10.0.10.91:8015 10.0.10.95:60555 ESTABLISHED
>> > tcp 78618 46336 10.0.10.91:8015 10.0.10.87:57796 ESTABLISHED
>> > tcp 36930 95568 10.0.10.91:40632 10.0.10.94:8015 ESTABLISHED
>> >
>> > Any tips on how to debug/solve this?
>>
>> If you haven't set a removeWaitTimeout (which you shouldn't), the lock
>> wait is interrupted every 30 seconds and started again. Waiting for
>> the lock is normal unless the queue fills up. So whether your
>> observation indicates a problem should be determined by looking at
>> the queue size. You can get the queue size e.g. from the jmxproxy
>> of the manager webapp
>> (http://myserver:myport/manager/jmxproxy?qry=*:*) or from JConsole.
>> Look for "FastAsyncSocketSender" and the attribute queueSize. It
>> should be "0" most of the time.
>>
>> There are more statistics for the replication, so you should be able
>> to see whether your replication is actually stuck, or whether it
>> simply can't keep up with the amount of replication needed.
>>
>> If not: take a couple of thread dumps (kill -QUIT) on one node with
>> pauses of about 3 seconds in between. The results go to catalina.out.
>> Then have a look at which threads actually hold the lock the
>> FastAsyncSocketSender is waiting for. Are they changing? If not, what
>> are the threads holding the lock doing? You could also post the dumps
>> or make them available somewhere for us to look at.
>>
>> >
>> > Ronald.
>>
>> Regards,
>>
>> Rainer
>>
>
> Hi,
>
> I have queueSizes of about 25000 in JConsole. Not good, I think.
> Most cluster threads are waiting in socket.write()
> (SocketReplicationThread.sendAck and
> FastAsyncSocketSender$FastQueueThread.pushQueuedMessages). The TCP
> buffers in Linux are full. It looks like all my cluster threads in
> Tomcat are writing, but nobody is reading.

Yes, 25000 is bad. Would you mind making the thread dumps available?
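If you haven't scripted it yet, something like this takes a few dumps
spaced about 3 seconds apart (untested sketch; replace <pid> with the
pid of the Tomcat JVM):

  for i in 1 2 3; do kill -QUIT <pid>; sleep 3; done
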
You should also post your cluster config block from server.xml.
Did you set waitForAck? The default should be "false" for the async sender; you can check in JConsole. Your thread observation seems to indicate that it's "true".
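
For example, you can pull the sender MBeans out of the jmxproxy output
with something like this (myserver/myport are placeholders and the grep
is only a rough filter; untested):

  curl -s 'http://myserver:myport/manager/jmxproxy?qry=*:*' | grep -A 10 FastAsyncSocketSender

Then look for the queueSize and waitForAck attributes in that output.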

Regards,

Rainer



I found my problem. receiver.sendAck is true (by default?), but
sender.waitForAck is false (by default?).
I set sender.waitForAck to true and everything seems to work now. I
understand ack should be false everywhere for fastasync, but at least it
is working again without the buffers filling up, and I don't have so
many session changes that the acks hurt performance. I'll change ack to
false next week (see the sketch after my server.xml below).

A nice feature request would be for Tomcat to auto-negotiate the ack
settings when replication starts up.

This is my current server.xml; I just added the sender.waitForAck.

<Server port="8005" shutdown="SHUTDOWN">

<GlobalNamingResources>
<!-- Used by Manager webapp -->
<Resource name="UserDatabase" auth="Container"
type="org.apache.catalina.UserDatabase"
description="User database that can be updated and saved"
factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
pathname="conf/tomcat-users.xml" />
</GlobalNamingResources>

<Service name="Catalina">
<Connector port="8080" maxHttpHeaderSize="8192"
maxThreads="300" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="8443" acceptCount="1024"
connectionTimeout="20000" disableUploadTimeout="true"
compression="on"
compressableMimeTypes="text/html,text/xml,text/plain,text/javascript,text/css"/>

<Engine name="Catalina" defaultHost="localhost">
<Realm className="org.apache.catalina.realm.UserDatabaseRealm"
resourceName="UserDatabase" />
<Host name="localhost"
appBase="/usr/local/crm-PREVIEW/deployed"
unpackWARs="true" autoDeploy="false" reloadable="false"
usePooling="false"
xmlValidation="false" xmlNamespaceAware="true">
<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
manager.className="org.apache.catalina.cluster.session.DeltaManager"
manager.stateTransferTimeout="60"
manager.sendAllSessions="false"
manager.sendAllSessionsSize="500"
manager.sendAllSessionsWaitTime="20"
service.mcastPort="48079"
sender.waitForAck="true" />
</Host>
</Engine>

</Service>
</Server>
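
For next week's change I expect to just flip both ack settings off in
the Cluster element. Untested sketch (I'm assuming receiver.sendAck is
the matching dotted attribute for the receiver's sendAck property):

        <Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
                 manager.className="org.apache.catalina.cluster.session.DeltaManager"
                 manager.stateTransferTimeout="60"
                 manager.sendAllSessions="false"
                 manager.sendAllSessionsSize="500"
                 manager.sendAllSessionsWaitTime="20"
                 service.mcastPort="48079"
                 receiver.sendAck="false"
                 sender.waitForAck="false" />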


Thanks for your help.

Ronald.
