Hi again, I tried the config with keepAliveTime set to 10:

<Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
           timeout="60000" keepAliveTime="10" keepAliveCount="0"/>

Once again, the cluster is not working. The big problem is that I cannot reproduce the error on my backup server, which works perfectly.

Node 2 logs a warning at 12:58 PM; at the same time, node 1 starts reporting "ClusterError" continuously (the errors occur on every hit; the server handles 1 hit per second).

Logs:

NODE 2 - LOG
=============
Jan 31, 2008 12:58:13 PM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts
WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):[EMAIL PROTECTED] last access:2008-01-31 12:58:10.208

NODE 1 - LOG
=============
Jan 31, 2008 12:58:04 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=101194547,id={123 -66 95 -10 88 24 77 -32 -93 16 -13 -112 90 52 -18 78 }, payload={}, command={}, domain={}, ]] message. Will verify.
Jan 31, 2008 12:58:04 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=101194547,id={123 -66 95 -10 88 24 77 -32 -93 16 -13 -112 90 52 -18 78 }, payload={}, command={}, domain={}, ]]
Jan 31, 2008 12:58:04 PM org.apache.catalina.ha.tcp.SimpleTcpCluster send
SEVERE: Unable to send message through cluster sender.
org.apache.catalina.tribes.ChannelException: Operation has timed out(60000 ms.).; Faulty members:tcp://localhost:4002;
        at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:97)
        at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:53)
        at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:80)
        at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:78)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.sendMessage(ThroughputInterceptor.java:61)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:73)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:87)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:216)
        at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:175)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.send(SimpleTcpCluster.java:835)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.sendClusterDomain(SimpleTcpCluster.java:814)
        at org.apache.catalina.ha.tcp.ReplicationValve.send(ReplicationValve.java:551)
        at org.apache.catalina.ha.tcp.ReplicationValve.sendMessage(ReplicationValve.java:535)
        at org.apache.catalina.ha.tcp.ReplicationValve.sendSessionReplicationMessage(ReplicationValve.java:517)
        at org.apache.catalina.ha.tcp.ReplicationValve.sendReplicationMessage(ReplicationValve.java:428)
        at org.apache.catalina.ha.tcp.ReplicationValve.invoke(ReplicationValve.java:362)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)
Jan 31, 2008 12:58:07 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=101197553,id={123 -66 95 -10 88 24 77 -32 -93 16 -13 -112 90 52 -18 78 }, payload={}, command={}, domain={}, ]] message. Will verify.
Jan 31, 2008 12:58:07 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=101197553,id={123 -66 95 -10 88 24 77 -32 -93 16 -13 -112 90 52 -18 78 }, payload={}, command={}, domain={}, ]]
[...] repeats on every hit.
========================

I cannot understand the node 2 log. Why is node 2 crashing? What can I do?

Thanks in advance.

Raúl.

-----Original message-----
From: Filip Hanik - Dev Lists [mailto:[EMAIL PROTECTED]
Sent: Monday, 28 January 2008 1:45
To: Tomcat Users List
Subject: Re: Tomcat 6 - Cluster error.

I'd set keepAliveTime to 10 as well,
Filip

Raúl García wrote:
> Hi again, and once again thanks for your time, but we still have problems.
>
> We applied the "keepAliveCount=0" param, and last Wednesday, 23 Jan, we
> restarted both nodes.
>
> Around 11 hours after the startup, node 1 reported a new error, but both nodes
> are working perfectly.
>
> I cannot imagine why the member disappears unexpectedly. I am reposting the error
> and the config files.
>
> INSTANCE 1 - LOG
> ================
> Jan 24, 2008 10:25:54 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
> INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=123412856,id={-31 -91 -122 -60 -58 -5 68 25 -87 13 -20 -12 -100 5 -16 94 }, payload={}, command={}, domain={}, ]] message. Will verify.
> Jan 24, 2008 10:25:54 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
> INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=123412856,id={-31 -91 -122 -60 -58 -5 68 25 -87 13 -20 -12 -100 5 -16 94 }, payload={}, command={}, domain={}, ]]
> Jan 24, 2008 10:25:54 PM org.apache.catalina.ha.tcp.SimpleTcpCluster send
> SEVERE: Unable to send message through cluster sender.
> org.apache.catalina.tribes.ChannelException: Operation has timed out(60000 ms.).; Faulty members:tcp://localhost:4002;
>         at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:97)
>         at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:53)
>         at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:80)
>         at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:78)
>         at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
>         at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.sendMessage(ThroughputInterceptor.java:61)
>         at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
>         at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:73)
>         at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
>         at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:87)
>         at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
>         at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:216)
>         at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:175)
>         at org.apache.catalina.ha.tcp.SimpleTcpCluster.send(SimpleTcpCluster.java:835)
>         at org.apache.catalina.ha.tcp.SimpleTcpCluster.sendClusterDomain(SimpleTcpCluster.java:814)
>         at org.apache.catalina.ha.tcp.ReplicationValve.send(ReplicationValve.java:551)
>         at org.apache.catalina.ha.tcp.ReplicationValve.sendMessage(ReplicationValve.java:535)
>         at org.apache.catalina.ha.tcp.ReplicationValve.sendSessionReplicationMessage(ReplicationValve.java:517)
>         at org.apache.catalina.ha.tcp.ReplicationValve.sendReplicationMessage(ReplicationValve.java:428)
>         at org.apache.catalina.ha.tcp.ReplicationValve.invoke(ReplicationValve.java:362)
>         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
>         at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
>         at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
>         at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
>         at java.lang.Thread.run(Thread.java:619)
>
> Jan 24, 2008 10:26:54 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
> INFO: Received memberDisappeared [...] repeats only once again.
>
> Jan 25, 2008 5:37:52 AM org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor report
> INFO: ThroughputInterceptor Report[
>         Tx Msg:66167 messages
>         Sent:37.02 MB (total)
>         Sent:37.02 MB (application)
>         Time:118.53 seconds
>         Tx Speed:0.31 MB/sec (total)
>         TxSpeed:0.31 MB/sec (application)
>         Error Msg:2
>         Rx Msg:90000 messages
>         Rx Speed:0.00 MB/sec (since 1st msg)
>         Received:41.06 MB]
>
> INSTANCE-1 --- Server.xml
> ==========================
> NOTE: 111.111.111.111 is the server IP address.
> ==========================
> <Server port="8006" shutdown="SHUTDOWN" debug="0">
>   <Listener className="org.apache.catalina.core.JasperListener" debug="0"/>
>   <Listener className="org.apache.catalina.mbeans.ServerLifecycleListener" debug="0"/>
>   <Listener className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener" debug="0"/>
>
>   <GlobalNamingResources>
>     <Environment name="InstanceName" type="java.lang.String" value="pro1"/>
>
>     <Resource name="UserDatabase" auth="Container"
>               type="org.apache.catalina.UserDatabase"
>               description="User database that can be updated and saved"
>               factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
>               pathname="conf/tomcat-users.xml" />
>   </GlobalNamingResources>
>
>   <Service name="Catalina">
>
>     <Connector port="8081" protocol="HTTP/1.1" maxHttpHeaderSize="8192"
>                emptySessionPath="true"
>                maxThreads="150" minSpareThreads="100" maxSpareThreads="300"
>                enableLookups="false" redirectPort="81443" acceptCount="1000"
>                debug="0" connectionTimeout="20000"
>                disableUploadTimeout="true"
>                compression="on"
>                compressionMinSize="2048"
>                noCompressionUserAgents="gozilla, traviata"
>                compressableMimeType="text/html,text/xml" />
>
>     <Engine name="Catalina" defaultHost="localhost" debug="0" jvmRoute="PR1">
>       <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
>                channelSendOptions="6">
>
>         <Manager className="org.apache.catalina.ha.session.DeltaManager"
>                  expireSessionsOnShutdown="false"
>                  notifyListenersOnReplication="true"/>
>
>         <Channel className="org.apache.catalina.tribes.group.GroupChannel">
>           <Membership className="org.apache.catalina.tribes.membership.McastService"
>                       address="228.0.0.4"
>                       port="45564"
>                       frequency="1000"
>                       dropTime="30000"/>
>           <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
>                     address="127.0.0.1"
>                     port="4001"
>                     autoBind="100"
>                     selectorTimeout="5000"
>                     maxThreads="12"/>
>
>           <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
>             <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
>                        timeout="60000" keepAliveCount="0"/>
>           </Sender>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
>         </Channel>
>
>         <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
>                filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
>
>         <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
>                   tempDir="/tmp/war-temp/"
>                   deployDir="/tmp/war-deploy/"
>                   watchDir="/tmp/war-listen/"
>                   watchEnabled="false"/>
>
>         <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
>         <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
>       </Cluster>
>       <Realm className="org.apache.catalina.realm.UserDatabaseRealm"
>              debug="0" resourceName="UserDatabase"/>
>       <Host name="localhost" debug="0" appBase="webapps"
>             unpackWARs="true" autoDeploy="true"
>             xmlValidation="false" xmlNamespaceAware="false">
>         <Valve className="org.apache.catalina.valves.RemoteAddrValve"
>                allow="10.0.0.*,127.0.0.1,228.0.0.4,111.111.111.111"/>
>       </Host>
>     </Engine>
>   </Service>
> </Server>
> ==============================================
>
> INSTANCE-2 server.xml
> =====================
> <Server port="8007" shutdown="SHUTDOWN" debug="0">
>
>   <Listener className="org.apache.catalina.core.JasperListener" debug="0"/>
>   <Listener className="org.apache.catalina.mbeans.ServerLifecycleListener" debug="0"/>
>   <Listener className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener" debug="0"/>
>
>   <GlobalNamingResources>
>
>     <Environment name="InstanceName" type="java.lang.String" value="pro2"/>
>
>     <Resource name="UserDatabase" auth="Container"
>               type="org.apache.catalina.UserDatabase"
>               description="User database that can be updated and saved"
>               factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
>               pathname="conf/tomcat-users.xml"/>
>   </GlobalNamingResources>
>
>   <Service name="Catalina">
>
>     <Connector port="8082" protocol="HTTP/1.1" maxHttpHeaderSize="8192"
>                emptySessionPath="true"
>                maxThreads="150" minSpareThreads="100" maxSpareThreads="300"
>                enableLookups="false" redirectPort="82443" acceptCount="1000"
>                debug="0" connectionTimeout="20000"
>                disableUploadTimeout="true"
>                compression="on"
>                compressionMinSize="2048"
>                noCompressionUserAgents="gozilla, traviata"
>                compressableMimeType="text/html,text/xml" />
>     <Engine name="Catalina" defaultHost="localhost" debug="0" jvmRoute="PR2">
>
>       <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
>                channelSendOptions="6">
>
>         <Manager className="org.apache.catalina.ha.session.DeltaManager"
>                  expireSessionsOnShutdown="false"
>                  notifyListenersOnReplication="true"/>
>
>         <Channel className="org.apache.catalina.tribes.group.GroupChannel">
>           <Membership className="org.apache.catalina.tribes.membership.McastService"
>                       address="228.0.0.4"
>                       port="45564"
>                       frequency="1000"
>                       dropTime="30000"/>
>           <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
>                     address="127.0.0.1"
>                     port="4002"
>                     autoBind="100"
>                     selectorTimeout="5000"
>                     maxThreads="12"/>
>
>           <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
>             <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
>                        timeout="60000" keepAliveCount="0"/>
>           </Sender>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
>         </Channel>
>
>         <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
>                filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
>         <!-- <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/> -->
>
>         <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
>                   tempDir="/tmp/war-temp/"
>                   deployDir="/tmp/war-deploy/"
>                   watchDir="/tmp/war-listen/"
>                   watchEnabled="false"/>
>
>         <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
>         <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
>       </Cluster>
>
>       <Realm className="org.apache.catalina.realm.UserDatabaseRealm"
>              resourceName="UserDatabase" debug="0"/>
>
>       <Host name="localhost" debug="0" appBase="webapps"
>             unpackWARs="true" autoDeploy="true"
>             xmlValidation="false" xmlNamespaceAware="false">
>
>         <Valve className="org.apache.catalina.valves.RemoteAddrValve"
>                allow="10.0.0.*,127.0.0.1,228.0.0.4,111.111.111.111"/>
>       </Host>
>     </Engine>
>   </Service>
> </Server>
> ===============================
>
> -----Original message-----
> From: Filip Hanik - Dev Lists [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 17 January 2008 19:01
> To: Tomcat Users List
> Subject: Re: Tomcat 6 - Cluster error.
>
> already replied to your old thread
>
> ok, it looks like you might have ended up with a rogue socket,
> and what happens is that any message sent to that socket just gets lost
> in the ether, since it doesn't have any interest ops.
> There is a workaround for this: turn off keep alives altogether, or
> implement a keep alive timeout.
>
> Option 1 - no keep alives at all
>
> <Transport
>     className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
>     timeout="60000"
>     keepAliveCount="0"/>
>
> Option 2 - implement a keep alive timeout
>
> <Transport
>     className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
>     timeout="60000"
>     keepAliveTime="120000"/>
>
> or make a combination of both values
>
> either option should work for you.
>
> On a side note, I'm interested in whether the scenario you ran into is
> reproducible. If it keeps happening over and over again, then, if possible,
> I'd like to get some debug logs from you.
>
> Filip
>
> ---------------------------------------------------------------------
> To start a new topic, e-mail: users@tomcat.apache.org
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
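
Note for anyone debugging a similar setup: a quick first check is whether each node's Tribes receiver port accepts TCP connections at all. The sketch below is only an illustration using plain JDK sockets; the class name ReceiverProbe and its method are hypothetical and not part of Tomcat or Tribes. The address 127.0.0.1 and the ports 4001 and 4002 are taken from the quoted server.xml files. It is not how TcpFailureDetector is implemented, just a comparable manual test.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/**
 * Hypothetical helper (not part of Tomcat or Tribes): checks whether
 * a TCP port accepts connections within a given timeout.
 */
final class ReceiverProbe {

    private ReceiverProbe() {}

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, host unreachable, or connect timed out
            return false;
        }
    }

    public static void main(String[] args) {
        // Receiver address and ports as configured in the quoted server.xml files
        int[] ports = {4001, 4002};
        for (int port : ports) {
            boolean ok = isReachable("127.0.0.1", port, 3000);
            System.out.println("127.0.0.1:" + port + " reachable: " + ok);
        }
    }
}
```

A successful connect only shows that the port is listening. It does not show that the receiver is actually reading from an established socket, which is the situation described in this thread: the member is verified as still alive while sends to it time out.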