I'm running two Tomcat (5.0.18) instances with Simple TCP Clustering on the same Solaris 9 server, and there are multiple web applications that need to be clustered. When the secondary Tomcat starts up, only one webapp receives the session state, while all the others wait 60 seconds and time out. The strange thing is that which webapp successfully receives the session state seems to be random: I have seen different webapps report receiving session state, but never more than one of them in a single startup.

Once Tomcat is started, session replication works fine, even for the applications that reported a timeout during startup. I can kill the Tomcat that is serving the user session at any time, and the other Tomcat takes over without losing any user input.

It's very annoying, because each timeout takes 60 seconds, so the secondary Tomcat always takes a few minutes to start up. I have been looking into this for a whole week without any clue :( Any help is appreciated.

All webapps that need to be clustered have <distributable /> in their web.xml. Here's the cluster configuration in server.xml:

<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
         name="FilipsCluster"
         debug="10"
         serviceclass="org.apache.catalina.cluster.mcast.McastService"
         mcastAddr="228.0.0.8"          ## same mcastAddr on the other tomcat
         mcastPort="45565"              ## same mcastPort on the other tomcat
         mcastFrequency="500"
         mcastDropTime="3000"
         tcpThreadCount="6"
         tcpListenAddress="auto"
         tcpListenPort="4001"           ## using port 4002 on the other tomcat, since both instances are on the same server
         tcpSelectorTimeout="100"
         printToScreen="false"
         expireSessionsOnShutdown="false"
         useDirtyFlag="true"
         replicationMode="pooled" />

The following is the startup log from catalina.out:

...
(the following seems to show that multicast works fine and Tomcat joins the cluster membership)

INFO: Cluster is about to start
Oct 24, 2005 5:01:57 PM org.apache.catalina.cluster.tcp.SimpleTcpCluster start
INFO: Sleeping for 2000 secs to establish cluster membership
Oct 24, 2005 5:01:57 PM org.apache.catalina.cluster.tcp.SimpleTcpCluster memberAdded
INFO: Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://206.47.63.195:4001,206.47.63.195,4001, alive=38841]
....
(the following is webapp /comstg, which fails to receive session state within 60 seconds)

INFO: Processing Context configuration file URL file:/app/ufs2/UfsServer/tomcat/conf/Catalina/localhost/comstg.xml
Oct 24, 2005 5:02:09 PM org.apache.catalina.cluster.session.DeltaManager start
INFO: Starting clustering manager...:/comstg
Oct 24, 2005 5:02:09 PM org.apache.catalina.cluster.tcp.SimpleTcpCluster messageDataReceived
WARNING: Context manager doesn't exist:/comqa
Oct 24, 2005 5:02:09 PM org.apache.catalina.cluster.session.DeltaManager start
WARNING: Manager[/comstg], requesting session state from org.apache.catalina.cluster.mcast.McastMember[tcp://206.47.63.195:4001,206.47.63.195,4001, alive=38841]. This operation will timeout if no session state has been received within 60 seconds
Oct 24, 2005 5:03:10 PM org.apache.catalina.cluster.session.DeltaManager start
SEVERE: Manager[/comstg], No session state received, timing out.
...
(the following is webapp /comqa, which successfully receives session state in 106 ms)

INFO: Processing Context configuration file URL file:/app/ufs2/UfsServer/tomcat/conf/Catalina/localhost/comqa.xml
Oct 24, 2005 5:03:17 PM org.apache.catalina.cluster.session.DeltaManager start
INFO: Starting clustering manager...:/comqa
Oct 24, 2005 5:03:17 PM org.apache.catalina.cluster.session.DeltaManager start
WARNING: Manager[/comqa], requesting session state from org.apache.catalina.cluster.mcast.McastMember[tcp://206.47.63.195:4001,206.47.63.195,4001, alive=38841]. This operation will timeout if no session state has been received within 60 seconds
Oct 24, 2005 5:03:17 PM org.apache.catalina.cluster.session.DeltaManager start
INFO: Manager[/comqa], session state received in 106 ms.
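
To see at a glance which webapps timed out on a given restart, something like the following could be run over catalina.out. It is only a rough sketch in Python: the function name summarize_session_state is made up for illustration, and it assumes the DeltaManager outcome messages are worded exactly as in the log above ("No session state received, timing out" and "session state received in N ms").

```python
import re

# Matches the two DeltaManager outcome messages as they appear in the log
# excerpt above. The exact wording is an assumption taken from that excerpt.
_OUTCOME = re.compile(
    r"Manager\[(?P<ctx>[^\]]+)\], "
    r"(?:(?P<timeout>No session state received, timing out)"
    r"|session state received in (?P<ms>\d+) ms)"
)


def summarize_session_state(log_text):
    """Map each context path to its transfer time in ms, or None on timeout.

    If a context appears more than once (several restarts in one log),
    the last outcome wins.
    """
    results = {}
    for match in _OUTCOME.finditer(log_text):
        if match.group("timeout"):
            results[match.group("ctx")] = None
        else:
            results[match.group("ctx")] = int(match.group("ms"))
    return results
```

Calling summarize_session_state(open("catalina.out").read()) on the excerpt above would map /comstg to None (timed out) and /comqa to 106. The "requesting session state" warnings are deliberately ignored, since they are logged in both the success and the timeout case.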