Hey Yogesh,
please update to the current svn head.
See the following bug, which is now fixed:
http://issues.apache.org/bugzilla/show_bug.cgi?id=37896
See the 5.5.14 changelog for the other bugs that exist in 5.5.12.
Please report back on how it works!
Peter
Tip: For high load, the fastasyncqueue sender mode is better.
Also, you don't need autoConnect!
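As a rough sketch of that tip, the Sender element from your configuration might look like this with the mode switched. Only replicationMode="fastasyncqueue" and dropping autoConnect come from the tip above; leaving every other attribute at its default is an assumption, so please check it against the cluster documentation for your version:

```xml
<!-- Sketch only: the Sender from the quoted config, switched to fastasyncqueue.
     autoConnect is dropped; all other attributes are left at their defaults (assumption). -->
<Sender
    className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
    replicationMode="fastasyncqueue"/>
```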
Yogesh Prajapati wrote:
Details of the Tomcat clustering load-test environment:
Application: A web portal, pure JSP/Servlet implementation using JDBC
(Oracle 10g RAC), OLTP in nature.
Load Test Tool: JMeter
Clustering Setup: 4 nodes
OS: SUSE Enterprise 9 (SP2) on all nodes (kernel: 2.6.5-7.97)
Software: JDK 1.5.0_05, Tomcat 5.5.12
Hardware configuration:
Node #1: Dual Pentium III (Coppermine) 1 GHz, 1 GB RAM
Node #2: Single Intel(R) XEON(TM) CPU 2.00GHz, 1 GB RAM
Node #3: Dual Pentium III (Coppermine) 1 GHz, 1 GB RAM
Node #4: Single Intel(R) XEON(TM) CPU 2.00GHz, 1 GB RAM
Network Configuration: All nodes are behind an Alteon load balancer
(response-time-based load balancing). Each node has two NICs: one on subnet
10.1.13.0 for the load-balancing network and one on 10.1.11.0 for the
private LAN. The private NIC has multicast enabled. All private NICs are
connected to a 10/100 Fast Ethernet switch.
Tomcat cluster configuration (same on all nodes):
<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
         managerClassName="org.apache.catalina.cluster.session.DeltaManager"
         expireSessionsOnShutdown="false"
         useDirtyFlag="true"
         notifyListenersOnReplication="true">
  <Membership
      className="org.apache.catalina.cluster.mcast.McastService"
      mcastAddr="228.0.0.4"
      mcastPort="45564"
      mcastFrequency="1000"
      mcastDropTime="35000"
      mcastBindAddr="auto"/>
  <Receiver
      className="org.apache.catalina.cluster.tcp.ReplicationListener"
      tcpListenAddress="auto"
      tcpListenPort="4001"
      tcpThreadCount="24"/>
  <Sender
      className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
      replicationMode="pooled"
      autoConnect="true"
      keepAliveTimeout="-1"
      maxPoolSocketLimit="600"
      doTransmitterProcessingStats="true"/>
  <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
         filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
  <Deployer className="org.apache.catalina.cluster.deploy.FarmWarDeployer"
            tempDir="/tmp/war-temp/"
            deployDir="/tmp/war-deploy/"
            watchDir="/tmp/war-listen/"
            watchEnabled="false"/>
  <ClusterListener className="org.apache.catalina.cluster.session.ClusterSessionListener"/>
</Cluster>
Note: the application requires session availability on all nodes,
so I am using "pooled" mode.
Tomcat VM parameters (additional switches for VM tuning):
-XX:+AggressiveHeap -Xms832m -Xmx832m -XX:+UseParallelGC -XX:+PrintGCDetails
-XX:MaxGCPauseMillis=200 -XX:GCTimeRatio=9
After starting Tomcat on all the nodes, when I run the JMeter scripts with
20-70 concurrent user threads, the entire cluster works fine (almost 0%
errors). With more than 200 concurrent user threads, however, session
replication starts failing consistently and replication messages are
lost. Here is what I get in the Tomcat logs on all the nodes (many times):
WARNING: Message lost: [10.1.11.95:4,001] type=[org.apache.catalina.cluster.session.SessionMessageImpl], id=[40FC741DB987BF5161C3AEEB32570A8E-1134732225260]
java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:124)
    at org.apache.catalina.cluster.tcp.DataSender.writeData(DataSender.java:858)
    at org.apache.catalina.cluster.tcp.DataSender.pushMessage(DataSender.java:799)
    at org.apache.catalina.cluster.tcp.DataSender.sendMessage(DataSender.java:623)
    at org.apache.catalina.cluster.tcp.PooledSocketSender.sendMessage(PooledSocketSender.java:128)
    at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageData(ReplicationTransmitter.java:867)
    at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageClusterDomain(ReplicationTransmitter.java:460)
    at org.apache.catalina.cluster.tcp.SimpleTcpCluster.sendClusterDomain(SimpleTcpCluster.java:1012)
    at org.apache.catalina.cluster.session.DeltaManager.send(DeltaManager.java:629)
    at org.apache.catalina.cluster.session.DeltaManager.sendCreateSession(DeltaManager.java:617)
    at org.apache.catalina.cluster.session.DeltaManager.createSession(DeltaManager.java:593)
    at org.apache.catalina.cluster.session.DeltaManager.createSession(DeltaManager.java:572)
.............................
.............................
I have also seen the following error, less often, on two of the nodes
(#3 and #4):
SEVERE: TCP Worker thread in cluster caught 'java.lang.ArrayIndexOutOfBoundsException: 1025' closing channel
java.lang.ArrayIndexOutOfBoundsException: 1025
    at org.apache.catalina.cluster.io.XByteBuffer.toInt(XByteBuffer.java:231)
    at org.apache.catalina.cluster.io.XByteBuffer.countPackages(XByteBuffer.java:164)
    at org.apache.catalina.cluster.io.ObjectReader.append(ObjectReader.java:87)
    at org.apache.catalina.cluster.tcp.TcpReplicationThread.drainChannel(TcpReplicationThread.java:127)
    at org.apache.catalina.cluster.tcp.TcpReplicationThread.run(TcpReplicationThread.java:69)
With all of the above warnings and exceptions, I get the following JMeter
results (script run with 200 concurrent threads, 5 iterations, 0 sec
ramp-up period):
Rate: 28 req/sec
Error: 9.07 %
The rate is acceptable, but the error rate is very high, and it goes up
further as the number of user threads increases. I have run the JMeter
script several times while tweaking the cluster configuration, but I cannot
figure out what I am doing wrong.
Is "Broken pipe" a serious failure and a blocker, or can it safely be
ignored?
The "ArrayIndexOutOfBoundsException" looks like a bug to me. Has it already
been reported?
In the current scenario, memory usage is below 600 MB. My target is to reach
2000 concurrent user threads while keeping errors within 3% and maintaining
the same req/sec. Does this mean I have to add more memory (making it 2 GB
on each node)?
Is there something else I am missing that I need to look at?
Any suggestions, ideas, or tips are most welcome and appreciated.
Thanks
Yogi