Hi, First I must congratulate on the improvements of the clustering docs! Now I don't have to guess nearly as much as before when trying to set up clustering. :-)
But I still have some troubles regarding clustering. We have a webapp clustered over three tomcats, all on the same machine, load balanced by mod_jk in an Apache frontend, using sticky lb, session replication by fastasyncqueue. It seems OK - no session dropouts in sticky mode, we can turn on and off tomcats as we please. But, in the logs, we see numerous: SEVERE: TCP Worker thread in cluster caught 'java.io.IOException: Connection reset by peer' closing channel java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcher.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233) at sun.nio.ch.IOUtil.read(IOUtil.java:206) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:207) at o.a.c.cluster.tcp.TcpReplicationThread.drainChannel(TcpReplicationThread.java:125) at o.a.c.cluster.tcp.TcpReplicationThread.run(TcpReplicationThread.java:69) I am trying to track these down, as they are marked as SEVERE. The tomcats sits on a Red Hat Linux server. It is configured with two networks, one open to the world, and firewalled, the other is internal, and is fully open. The internal net is used for clustering, db-access, and mod_jk connections. How severe are these? Can they be a result of two near concurrent requests, and the second request cancels the serialisation of the first session object? If not, where should I start to look? Log files and cluster setup follows: cluster part of server.xml (only tcpListenPort differs among our configurations): ================================================================================= <Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster" managerClassName="org.apache.catalina.cluster.session.DeltaManager" doClusterLog="true" clusterLogName="clusterlog" expireSessionsOnShutdown="false" useDirtyFlag="true" notifyListenersOnReplication="true" notifySessionListenersOnReplication="true"> <Membership className="org.apache.catalina.cluster.mcast.McastService" mcastAddr="228.0.0.4" mcastPort="45564" mcastFrequency="500" mcastDropTime="3000"/> <Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener" tcpListenAddress="192.168.5.112" tcpListenPort="4003" tcpSelectorTimeout="100" tcpThreadCount="2"/> <Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter" replicationMode="fastasyncqueue" ackTimeout="15000"/> <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve" filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.* \.txt;"/> <Deployer className="org.apache.catalina.cluster.deploy.FarmWarDeployer" tempDir="${catalina.base}/war-temp" deployDir="${catalina.base}/webapps/" watchDir="${catalina.base}/war-listen/" watchEnabled="false"/> <ClusterListener className="org.apache.catalina.cluster.session.ClusterSessionListener"/> <Valve className="org.apache.catalina.cluster.session.JvmRouteBinderValve" enabled="true"/> <ClusterListener className="org.apache.catalina.cluster.session.ClusterSessionListener"/> <ClusterListener className="org.apache.catalina.cluster.session.JvmRouteSessionIDBinderListener"/> </Cluster> Logs demonstrating the steps toward exceptions: ================================================================================= #######Request #1, trying to access home page, redirected to login page: #######Access log from apache: i.p.add.ress - - [07/Nov/2005:11:11:00 +0100] "GET /npweb/t-app HTTP/1.1" 302 - i.p.add.ress - - [07/Nov/2005:11:11:00 +0100] "GET /npweb/t-app?service=external/bitecore.loginPage HTTP/1.1" 200 4633 i.p.add.ress - - [07/Nov/2005:11:11:00 +0100] "GET /favicon.ico HTTP/1.1" 404 288 #######Tail from tomcat logs: ==> tomcat1/logs/catalina.out <== Nov 7, 2005 11:11:00 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logSendMessage INFO: SEND Nov 7, 2005:11:11:00 AM 1 - B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358260068 Nov 7, 2005 11:11:00 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logSendMessage INFO: SEND Nov 7, 2005:11:11:00 AM 0 - B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358260072 [np-norway] INFO [TP-Processor29] BiteEngine.registerTestFormatConfigs(91) | registering all test format configs [np-norway] INFO [TP-Processor29] BiteEngine.registerTestFormatConfigs(108) | TOTALLY REGISTERED 12 TEST FORMAT CONFIGS IN SYSTEM Nov 7, 2005 11:11:00 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logSendMessage INFO: SEND Nov 7, 2005:11:11:00 AM 0 - B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358260109 ==> tomcat2/logs/catalina.out <== Nov 7, 2005 11:11:00 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 7, 2005:11:11:00 AM 0 192.168.5.112:4,001 B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358260068 true Nov 7, 2005 11:11:00 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 7, 2005:11:11:00 AM 1 192.168.5.112:4,001 B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358260072 true Nov 7, 2005 11:11:00 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 7, 2005:11:11:00 AM 0 192.168.5.112:4,001 B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358260109 true ==> tomcat3/logs/catalina.out <== Nov 7, 2005 11:11:00 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 7, 2005:11:11:00 AM 0 192.168.5.112:4,001 B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358260068 true Nov 7, 2005 11:11:00 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 7, 2005:11:11:00 AM 1 192.168.5.112:4,001 B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358260072 true Nov 7, 2005 11:11:00 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 7, 2005:11:11:00 AM 1 192.168.5.112:4,001 B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358260109 true #######Request #2 (after login): #######Access log from apache: i.p.add.ress - - [07/Nov/2005:11:12:03 +0100] "POST /npweb/j_acegi_security_check HTTP/1.1" 302 - i.p.add.ress - - [07/Nov/2005:11:12:03 +0100] "GET /npweb/t-app HTTP/1.1" 200 6454 i.p.add.ress - - [07/Nov/2005:11:12:05 +0100] "GET /favicon.ico HTTP/1.1" 404 288 #######Tail from tomcat logs: ==> tomcat1/logs/catalina.out <== [np-norway] INFO [TP-Processor29] LoggerListener.onApplicationEvent(125) | Authentication success for user: UserName; details: [EMAIL PROTECTED]: RemoteIpAddress: i.p.add.ress; SessionId: B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc1 Nov 7, 2005 11:12:03 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logSendMessage INFO: SEND Nov 7, 2005:11:12:03 AM 2 - B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358323369 ==> tomcat2/logs/catalina.out <== Nov 7, 2005 11:12:03 AM org.apache.catalina.cluster.tcp.TcpReplicationThread run SEVERE: TCP Worker thread in cluster caught 'java.io.IOException: Connection reset by peer' closing channel java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcher.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233) at sun.nio.ch.IOUtil.read(IOUtil.java:206) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:207) at org.apache.catalina.cluster.tcp.TcpReplicationThread.drainChannel(TcpReplicationThread.java:125) at org.apache.catalina.cluster.tcp.TcpReplicationThread.run(TcpReplicationThread.java:69) Nov 7, 2005 11:12:03 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 7, 2005:11:12:03 AM 0 192.168.5.112:4,001 B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358323369 true Nov 7, 2005 11:12:03 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logSendMessage INFO: SEND Nov 7, 2005:11:12:03 AM 2 - 55AB686217BC200D0D6F026EEC3FA5C7.bserv2tc2-EXPIRED-MSG ==> tomcat3/logs/catalina.out <== Nov 7, 2005 11:12:03 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 7, 2005:11:12:03 AM 1 192.168.5.112:4,001 B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358323369 true Nov 7, 2005 11:12:03 AM org.apache.catalina.cluster.tcp.TcpReplicationThread run SEVERE: TCP Worker thread in cluster caught 'java.io.IOException: Connection reset by peer' closing channel java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcher.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233) at sun.nio.ch.IOUtil.read(IOUtil.java:206) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:207) at org.apache.catalina.cluster.tcp.TcpReplicationThread.drainChannel(TcpReplicationThread.java:125) at org.apache.catalina.cluster.tcp.TcpReplicationThread.run(TcpReplicationThread.java:69) Nov 7, 2005 11:12:03 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 7, 2005:11:12:03 AM 1 192.168.5.112:4,002 55AB686217BC200D0D6F026EEC3FA5C7.bserv2tc2-EXPIRED-MSG true ==> tomcat1/logs/catalina.out <== Nov 7, 2005 11:12:03 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 7, 2005:11:12:03 AM 0 192.168.5.112:4,002 55AB686217BC200D0D6F026EEC3FA5C7.bserv2tc2-EXPIRED-MSG true Nov 7, 2005 11:12:04 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logSendMessage INFO: SEND Nov 7, 2005:11:12:04 AM 0 - B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358324900 ==> tomcat2/logs/catalina.out <== Nov 7, 2005 11:12:04 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 7, 2005:11:12:04 AM 1 192.168.5.112:4,001 B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358324900 true ==> tomcat3/logs/catalina.out <== Nov 7, 2005 11:12:04 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logReceiveMessage INFO: RECEIVE Nov 7, 2005:11:12:04 AM 0 192.168.5.112:4,001 B07DBFF28ACB2BAC0F28FE9DAC65A69B.bserv2tc11131358324900 true
signature.asc
Description: Dette er en digitalt signert meldingsdel