Alright, I did some more testing with another application and found the following:

Sess      Time (sec)
10        0.101
125       0.101
500       0.201
1500      0.201
1800      0.101
2400      0.101
42,000    0.901   (that's not a typo)

Turns out the application that was having trouble is storing a silly amount of crap in the session. I am still not 100% sure what's happening behind the scenes at the 1500 session mark, but I'm guessing that based on our session size (nearly 700 MB) and memory configuration we were hitting some heap ceiling and the replication was forced to 'juggle'. If anyone has any more background on what's happening, feel free to set me straight.

I'll check back later... I need to go beat some developers...

Kyle Harper

From: kharp...@oreillyauto.com
To: "Tomcat Users List" <users@tomcat.apache.org>
Date: 09/05/2012 07:55 PM
Subject: Re: Tuning session replication on clusters

I'm working with Lee on this as well, so I can help answer most of that. In short: yes, all our replication is working well. We have keepalived acting as a VRRP device (no round-robin DNS) in front of a few web servers (Apache 2.2.x, mod_proxy/mod_proxy_ajp) which are using sticky sessions and BalancerMembers. Replication (DeltaManager/SimpleTcpCluster) is working as intended on the Tomcat side (6.0.24).

After further research, the problem we're seeing is performance with replication when the number of sessions is larger than around 2000. Using JMeter on our test servers I can reproduce the problem. Here are the times it takes to replicate X number of sessions when an application is restarted:

Sess      Time (sec)
10        0.101
125       0.401
500       1.302
1500      2.104
1800      5.308
1800      6.709
2400      15.02
3600      30.285
3600      27.238

The times make sense until around 1500. The time it takes to replicate more than 1500 sessions becomes exponentially worse.

Here is our cluster configuration from "node1":

<Engine name="Catalina" defaultHost="localhost" jvmRoute="tntest-app-a-1">
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
           channelSendOptions="8">
    <Manager className="org.apache.catalina.ha.session.DeltaManager"
             stateTransferTimeout="45"
             expireSessionsOnShutdown="false"
             notifyListenersOnReplication="true" />
    <Channel className="org.apache.catalina.tribes.group.GroupChannel">
      <Membership className="org.apache.catalina.tribes.membership.McastService"
                  address="239.255.0.1"
                  port="45564"
                  frequency="500"
                  dropTime="3000" />
      <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                address="auto"
                port="4000"
                autoBind="100"
                selectorTimeout="5000"
                maxThreads="6" />
      <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
        <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
                   timeout="45000" />
      </Sender>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
    </Channel>
    <Valve className="org.apache.catalina.ha.tcp.ReplicationValve" filter=""/>
    <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>
    <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
    <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
  </Cluster>

The best time we got for 3600 sessions was 24 seconds, and that's when I added the following to the Manager tag (stole this from the 5.5 docs; not even sure it's valid in 6.x):

  sendAllSessions="false"
  sendAllSessionsSize="500"
  sendAllSessionsWait="20"

What has me stumped is why the time required to do more sessions is exponentially higher beyond 1500 sessions.
Using JMeter I can simulate 3600 new users (all creating a session) and the two servers can serve the requests AND generate/replicate the sessions in under 19 seconds.

Any ideas would be greatly appreciated. I have a full test environment to simulate anything you might recommend.

Sincerely,
Kyle Harper

From: Igor Cicimov <icici...@gmail.com>
To: Tomcat Users List <users@tomcat.apache.org>
Date: 09/05/2012 07:12 PM
Subject: Re: Tuning session replication on clusters

On Thu, Sep 6, 2012 at 5:51 AM, <llow...@oreillyauto.com> wrote:

> I have a small cluster of 3 nodes running tomcat 6.0.24 with openJDK
> 1.6.0_20 on Ubuntu 10.04 LTS.
>
> I have roughly 5,000-6,000 sessions at any given time, and when I restart
> one of the nodes I am finding that not all sessions are getting
> replicated, even when I have the state transfer timeout set to 60
> seconds.
>
> It seems that only sessions that have been touched recently are replicated,
> even if the session is still otherwise valid. I did one test where I
> created about 1,500 sessions and then took out one node. When I brought it
> back online, it only replicated the 4-5 sessions that were from active
> users on the test cluster. It did not replicate the idle sessions that
> were still valid that my prior test had created.
>
> I am wanting to tune my settings, but I am unsure where would be the best
> place to start. Should I start with the threads available to the NIO
> Receiver, or would I be better off focusing on a different set of
> attributes first, such as the send or receive timeout values?
>
> Any tips or pointers as to which setting might be the most productive would
> be greatly appreciated.
>
> Lee Lowder
> O'Reilly Auto Parts
> Web Systems Administrator
> (417) 862-2674 x1858
>
> This communication and any attachments are confidential, protected by
> Communications Privacy Act 18 USCS § 2510, solely for the use of the
> intended recipient, and may contain legally privileged material. If you are
> not the intended recipient, please return or destroy it immediately. Thank
> you.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org

For starters, does your cluster satisfy the requirements below?

To run session replication in your Tomcat 6.0 container, the following steps should be completed:

- All your session attributes must implement java.io.Serializable
- Uncomment the Cluster element in server.xml
- If you have defined custom cluster valves, make sure you have the ReplicationValve defined as well under the Cluster element in server.xml
- If your Tomcat instances are running on the same machine, make sure the tcpListenPort attribute is unique for each instance; in most cases Tomcat is smart enough to resolve this on its own by autodetecting available ports in the range 4000-4100
- Make sure your web.xml has the <distributable/> element
- If you are using mod_jk, make sure that the jvmRoute attribute is set at your Engine, <Engine name="Catalina" jvmRoute="node01" >, and that the jvmRoute attribute value matches your worker name in workers.properties
- Make sure that all nodes have the same time and sync with an NTP service!
- Make sure that your load balancer is configured for sticky session mode.

Also, you don't say what you are using for load balancing. It would not hurt to post your cluster definition as well.

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
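
For the <distributable/> item in the checklist above, a minimal web.xml might look like the sketch below. The Servlet 2.5 namespace and version are an assumption based on the Tomcat 6.0 release mentioned in the thread; none of the messages show the application's actual deployment descriptor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes a Servlet 2.5 descriptor, as typically used with Tomcat 6.0 -->
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
                             http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
         version="2.5">

    <!-- Marks the web application as distributable so its sessions are eligible for replication -->
    <distributable/>

</web-app>
```

The element takes no attributes or content. It only declares that the application is safe to run in a clustered container, so every attribute stored in the session still has to be Serializable, as the first checklist item says.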