Hey, thanks for your fixes, sadly it didn't fix my problem. That is if I am doing things right, which there are no guarantees for. I will continue to hunt down the problem(s).
If I configure my cluster as you suggested in your first mail, the Exceptions disappears, if I use my original config it is as it were. I would believe that a valid configuration shouldn't make tomcat throw severe exceptions... I also started experiencing shutdown exceptions from the svn version. 5.5.12 didn't have these. (stacktrace can be found in bottom of mail). Are there any gotchas for using waitForAck="true" and keepAliveTimeout="-1"? For example less performance? I guess that waiting for acks from the whole cluster can be lengthy, especially if we want to use pooled replication? You also said in an earlier mail that I didn't need both the ClusterSessionListener, and the JvmRouteSessionIDBinderListener, why is that? And which do I need? And why are both listed in the example set-up in clusterdocs, if only one is needed? Now for the problem (sorry for the length of this mail). My setup is: The latest sources (as of 08:30 this morning, GMT+1 - 15/11-05) I have split tomcat in CATALINA_HOME (the compiled one) and two CATALINA_BASE: cluster1, and cluster2, jvmroute set accordingly. The CATALINA_BASE folders have webapps/servlet-examples from 5.5.12, where I have modified the session example to be distributable. For first tests I use this configuration: <Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster" managerClassName="org.apache.catalina.cluster.session.DeltaManager" doClusterLog="true" clusterLogName="clusterlog" expireSessionsOnShutdown="false" useDirtyFlag="true" notifyListenersOnReplication="true" notifySessionListenersOnReplication="true"> <Membership className="org.apache.catalina.cluster.mcast.McastService" mcastAddr="228.0.0.4" mcastPort="45564" mcastFrequency="500" mcastDropTime="3000"/> <Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener" tcpListenAddress="auto" tcpListenPort="4001" tcpSelectorTimeout="100" tcpThreadCount="2"/> <Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter" replicationMode="fastasyncqueue" ackTimeout="15000"/> <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve" filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.* \.txt;"/> <Deployer className="org.apache.catalina.cluster.deploy.FarmWarDeployer" tempDir="${catalina.base}/war-temp" deployDir="${catalina.base}/webapps/" watchDir="${catalina.base}/war-listen/" watchEnabled="false"/> <ClusterListener className="org.apache.catalina.cluster.session.ClusterSessionListener"/> <Valve className="org.apache.catalina.cluster.session.JvmRouteBinderValve" enabled="true" /> <ClusterListener className="org.apache.catalina.cluster.session.ClusterSessionListener" /> <ClusterListener className="org.apache.catalina.cluster.session.JvmRouteSessionIDBinderListener" /> </Cluster> When setting value 'asdf' in session in cluster1, and then switching to see the same servlet in cluster2, the replicated values appear ok, but if I wait some time (not doing any requests), I get the exceptions: ==> cluster1/logs/catalina.out <== Nov 15, 2005 9:35:42 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logSendMessage INFO: SEND Nov 15, 2005:9:35:42 AM 1 - BA25CFCE25F1E939EEF6F559D0A721B1.cluster1-1132043742511 ==> cluster1/logs/localhost.2005-11-15.log <== Nov 15, 2005 9:35:42 AM org.apache.catalina.core.ApplicationContext log INFO: SessionListener: attributeAdded('BA25CFCE25F1E939EEF6F559D0A721B1.cluster1', 'asdf', '') Nov 15, 2005 9:35:42 AM org.apache.catalina.core.ApplicationContext log INFO: InvokerFilter(ApplicationFilterConfig[name=Path Mapped Filter, filterClass=filters.ExampleFilter]): 7 milliseconds ==> cluster2/logs/catalina.out <== Nov 15, 2005 9:35:42 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 15, 2005:9:35:42 AM 15 127.0.0.1:4,001 BA25CFCE25F1E939EEF6F559D0A721B1.cluster1-1132043742511 true ==> cluster2/logs/localhost.2005-11-15.log <== Nov 15, 2005 9:35:42 AM org.apache.catalina.core.ApplicationContext log INFO: SessionListener: attributeAdded('BA25CFCE25F1E939EEF6F559D0A721B1.cluster1', 'asdf', '') Nov 15, 2005 9:35:42 AM org.apache.catalina.core.ApplicationContext log INFO: SessionListener: attributeReplaced('BA25CFCE25F1E939EEF6F559D0A721B1.cluster1', 'asdf', '') ==> cluster1/logs/catalina.out <== Nov 15, 2005 9:36:05 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 15, 2005:9:36:05 AM 1 127.0.0.1:4,002 BA25CFCE25F1E939EEF6F559D0A721B1.cluster1-1132043765344 true ==> cluster2/logs/catalina.out <== Nov 15, 2005 9:36:05 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logSendMessage INFO: SEND Nov 15, 2005:9:36:05 AM 4 - BA25CFCE25F1E939EEF6F559D0A721B1.cluster1-1132043765344 ==> cluster2/logs/localhost.2005-11-15.log <== Nov 15, 2005 9:36:05 AM org.apache.catalina.core.ApplicationContext log INFO: InvokerFilter(ApplicationFilterConfig[name=Path Mapped Filter, filterClass=filters.ExampleFilter]): 13 milliseconds <inactivity period> ==> cluster2/logs/catalina.out <== Nov 15, 2005 9:36:59 AM org.apache.catalina.cluster.tcp.TcpReplicationThread run SEVERE: TCP Worker thread in cluster caught 'java.io.IOException: Connection reset by peer' closing channel java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcher.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233) at sun.nio.ch.IOUtil.read(IOUtil.java:206) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:207) at org.apache.catalina.cluster.tcp.TcpReplicationThread.drainChannel(TcpReplicationThread.java:125) at org.apache.catalina.cluster.tcp.TcpReplicationThread.run(TcpReplicationThread.java:69) ==> cluster1/logs/catalina.out <== Nov 15, 2005 9:37:12 AM org.apache.catalina.cluster.tcp.TcpReplicationThread run SEVERE: TCP Worker thread in cluster caught 'java.io.IOException: Connection reset by peer' closing channel java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcher.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233) at sun.nio.ch.IOUtil.read(IOUtil.java:206) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:207) at org.apache.catalina.cluster.tcp.TcpReplicationThread.drainChannel(TcpReplicationThread.java:125) at org.apache.catalina.cluster.tcp.TcpReplicationThread.run(TcpReplicationThread.java:69) Also, when shutting down the nodes I get some exceptions (same for both nodes). This also happens if i turn off keppAliveTimeout, and set waitForAck to be true (as suggested in first mail). cluster2/runner.sh stop Using CATALINA_BASE: /usr/local/cluster2 Using CATALINA_HOME: /home/olve/utvikling/svn-tomcat/tomcat-build/build/build Using CATALINA_TMPDIR: /usr/local/cluster2/temp Using JRE_HOME: /usr/lib/j2sdk1.5-sun [EMAIL PROTECTED]:/usr/local $ ==> cluster2/logs/catalina.out <== Nov 15, 2005 9:54:18 AM org.apache.coyote.http11.Http11BaseProtocol pause INFO: Pausing Coyote HTTP/1.1 on http-8082 Nov 15, 2005 9:54:19 AM org.apache.catalina.core.StandardService stop INFO: Stopping service Catalina Nov 15, 2005 9:54:19 AM org.apache.catalina.cluster.session.DeltaManager stop INFO: Manager [/servlets-examples] expiring sessions upon shutdown Nov 15, 2005 9:54:19 AM org.apache.catalina.cluster.session.JvmRouteBinderValve stop INFO: JvmRouteBinderValve stopped Nov 15, 2005 9:54:19 AM org.apache.catalina.cluster.deploy.FarmWarDeployer stop INFO: Cluster FarmWarDeployer stopped. Nov 15, 2005 9:54:19 AM org.apache.catalina.cluster.util.FastQueue remove INFO: FastQueue: queue disabled, remove aborted Nov 15, 2005 9:54:19 AM org.apache.catalina.cluster.tcp.ReplicationTransmitter stop INFO: Stopped ClusterSender at cluster Catalina:type=Cluster,host=localhost with name Catalina:type=ClusterSender,host=localhost Nov 15, 2005 9:54:19 AM org.apache.catalina.cluster.tcp.ReplicationListener listen SEVERE: Unable to process request in ReplicationListener java.nio.channels.ClosedSelectorException at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:66) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80) at org.apache.catalina.cluster.tcp.ReplicationListener.listen(ReplicationListener.java:130) at org.apache.catalina.cluster.tcp.ClusterReceiverBase.run(ClusterReceiverBase.java:394) at java.lang.Thread.run(Thread.java:595) Nov 15, 2005 9:54:19 AM org.apache.catalina.cluster.tcp.ClusterReceiverBase run SEVERE: Unable to start cluster listener. java.lang.NullPointerException at org.apache.catalina.cluster.tcp.ReplicationListener.listen(ReplicationListener.java:182) at org.apache.catalina.cluster.tcp.ClusterReceiverBase.run(ClusterReceiverBase.java:394) at java.lang.Thread.run(Thread.java:595) Nov 15, 2005 9:54:19 AM org.apache.coyote.http11.Http11BaseProtocol destroy INFO: Stopping Coyote HTTP/1.1 on http-8082 Nov 15, 2005 9:54:19 AM org.apache.catalina.core.AprLifecycleListener lifecycleEvent INFO: Failed shutdown of Apache Portable Runtime ==> cluster2/logs/localhost.2005-11-15.log <== Nov 15, 2005 9:54:19 AM org.apache.catalina.core.ApplicationContext log INFO: SessionListener: contextDestroyed() Nov 15, 2005 9:54:19 AM org.apache.catalina.core.ApplicationContext log INFO: ContextListener: contextDestroyed() Nov 15, 2005 9:54:19 AM org.apache.catalina.core.ApplicationContext log INFO: SessionListener: contextDestroyed() Nov 15, 2005 9:54:19 AM org.apache.catalina.core.ApplicationContext log INFO: ContextListener: contextDestroyed() ==> cluster1/logs/catalina.out <== Nov 15, 2005 9:54:23 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster memberDisappeared INFO: Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://127.0.0.1:4002,catalina,127.0.0.1,4002, alive=1269259] Nov 15, 2005 9:54:23 AM org.apache.catalina.cluster.util.FastQueue remove INFO: FastQueue: queue disabled, remove aborted
signature.asc
Description: Dette er en digitalt signert meldingsdel