I'm working with Lee on this as well, so I can help answer most of that.

In short: yes, all our replication is working well. We have keepalived
acting as a VRRP device (no round-robin DNS) in front of a few web servers
(Apache 2.2.x, mod_proxy/mod_proxy_ajp) which are using sticky sessions and
BalancerMembers. Replication (DeltaManager/SimpleTcpCluster) is working
as intended on the Tomcat side (6.0.24).

After further research, the problem we're seeing is replication
performance when the number of sessions is larger than around 2000. I can
reproduce the problem using JMeter on our test servers. Here are the times
it takes to replicate a given number of sessions when an application is
restarted:
Sessions  Time (sec)
10         0.101
125        0.401
500        1.302
1500       2.104
1800       5.308
1800       6.709
2400      15.02
3600      30.285
3600      27.238

The times make sense until around 1500 sessions. Beyond that, the time it
takes to replicate grows far faster than the session count: 1500 sessions
take about 2 seconds, while 3600 take about 30. Here is our cluster
configuration from "node1":
    <Engine name="Catalina" defaultHost="localhost" jvmRoute="tntest-app-a-1">
      <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
               channelSendOptions="8">
        <Manager className="org.apache.catalina.ha.session.DeltaManager"
                 stateTransferTimeout="45"
                 expireSessionsOnShutdown="false"
                 notifyListenersOnReplication="true" />
        <Channel className="org.apache.catalina.tribes.group.GroupChannel">
          <Membership className="org.apache.catalina.tribes.membership.McastService"
                      address="239.255.0.1"
                      port="45564"
                      frequency="500"
                      dropTime="3000" />

          <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                    address="auto"
                    port="4000"
                    autoBind="100"
                    selectorTimeout="5000"
                    maxThreads="6" />

          <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
            <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
                       timeout="45000" />
          </Sender>

          <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
          <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
        </Channel>

        <Valve className="org.apache.catalina.ha.tcp.ReplicationValve" filter=""/>
        <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>

        <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
        <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
      </Cluster>


The best time we got for 3600 sessions was 24 seconds, and that's when I
added the following to the Manager tag (stole this from the 5.5 docs; not
even sure it's valid in 6.x):
                 sendAllSessions="false"
                 sendAllSessionsSize="500"
                 sendAllSessionsWait="20"
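
For reference, this is roughly what the Manager element looks like with
those three attributes merged in. It is only a sketch: the attribute names
are copied exactly as listed above, and it is an assumption that 6.0.x
recognizes all of them, so they should be checked against the 6.0
cluster manager documentation before relying on this.

```xml
<!-- Sketch only: the last three attribute names are copied from the
     5.5-era settings above and are not confirmed to be valid in 6.0.x. -->
<Manager className="org.apache.catalina.ha.session.DeltaManager"
         stateTransferTimeout="45"
         expireSessionsOnShutdown="false"
         notifyListenersOnReplication="true"
         sendAllSessions="false"
         sendAllSessionsSize="500"
         sendAllSessionsWait="20" />
```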


What has me stumped is why the time required to replicate sessions climbs
so disproportionately beyond 1500 sessions. Using JMeter I can simulate
3600 new users (all creating a session) and the two servers can serve the
requests AND generate/replicate the sessions in under 19 seconds. Any
ideas would be greatly appreciated. I have a full test environment to
simulate anything you might recommend.

Sincerely,
Kyle Harper





From:   Igor Cicimov <icici...@gmail.com>
To:     Tomcat Users List <users@tomcat.apache.org>
Date:   09/05/2012 07:12 PM
Subject:        Re: Tuning session replication on clusters



On Thu, Sep 6, 2012 at 5:51 AM, <llow...@oreillyauto.com> wrote:

>
> I have a small cluster of 3 nodes running Tomcat 6.0.24 with OpenJDK
> 1.6.0_20 on Ubuntu 10.04 LTS.
>
> I have roughly 5,000-6,000 sessions at any given time, and when I restart
> one of the nodes I am finding that not all sessions are getting
> replicated, even when I have the state transfer timeout set to 60
> seconds.
>
> It seems that only sessions that have been touched recently are
> replicated, even if the session is still otherwise valid. I did one test
> where I created about 1,500 sessions and then took out one node. When I
> brought it back online, it only replicated the 4-5 sessions that were
> from active users on the test cluster. It did not replicate the idle
> sessions that were still valid that my prior test had created.
>
> I want to tune my settings, but I am unsure where the best place to
> start would be. Should I start with the threads available to the NIO
> Receiver, or would I be better off focusing on a different set of
> attributes first, such as the send or receive timeout values?
>
> Any tips or pointers as to which setting might be the most productive
> would be greatly appreciated.
>
> Lee Lowder
> O'Reilly Auto Parts
> Web Systems Administrator
> (417) 862-2674 x1858
>
> This communication and any attachments are confidential, protected by
> Communications Privacy Act 18 USCS § 2510, solely for the use of the
> intended recipient, and may contain legally privileged material. If you
> are not the intended recipient, please return or destroy it immediately.
> Thank you.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>
>
For starters, does your cluster satisfy the requirements below?

To run session replication in your Tomcat 6.0 container, the following
steps should be completed:

   - All your session attributes must implement java.io.Serializable
   - Uncomment the Cluster element in server.xml
   - If you have defined custom cluster valves, make sure you have the
   ReplicationValve defined as well under the Cluster element in server.xml
   - If your Tomcat instances are running on the same machine, make sure
   the tcpListenPort attribute is unique for each instance. In most cases
   Tomcat is smart enough to resolve this on its own by autodetecting
   available ports in the range 4000-4100
   - Make sure your web.xml has the <distributable/> element
   - If you are using mod_jk, make sure that the jvmRoute attribute is set
   at your Engine <Engine name="Catalina" jvmRoute="node01" > and that the
   jvmRoute attribute value matches your worker name in workers.properties
   - Make sure that all nodes have the same time and sync with NTP service!
   - Make sure that your load balancer is configured for sticky session
   mode.
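
As an illustration of the web.xml item in the list above, a minimal
deployment descriptor might look like the sketch below. The Servlet 2.5
header is the one that matches Tomcat 6; the display-name is a made-up
placeholder, not something taken from this thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
                             http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
         version="2.5">
  <!-- Hypothetical application name, for illustration only -->
  <display-name>example-app</display-name>

  <!-- Marks the application's sessions as eligible for replication -->
  <distributable/>
</web-app>
```

Without the distributable element the application's sessions are not
replicated, even if the Cluster element in server.xml is set up correctly.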


Also, you don't say what you are using for load balancing. It wouldn't
hurt to post your cluster definition as well.



