Farm deploy random failures
I'm having intermittent failures when I deploy to a cluster. I see the war file sent to slave nodes but it then becomes zero size. It happens on different nodes and not all the time.

Upon failure, Master node .out shows

    SEVERE [Catalina-utility-1] org.apache.catalina.ha.tcp.SimpleTcpCluster.send Unable to send message through cluster sender.
    org.apache.catalina.tribes.ChannelException: Send failed, attempt:[1] max:[1]; Faulty members:tcp://{172, xx, xx, xx}:5222;
        at org.apache.catalina.tribes.transport.nio.ParallelNioSender.doLoop(ParallelNioSender.java:217)
        at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:78)
        at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:51)
        at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:65)
        at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:83)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:89)
        at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.sendMessage(ThroughputInterceptor.java:62)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:89)
        at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:93)

Slave node .out shows

    WARNING [Tribes-Task-Receiver[localhost-Channel]-7] org.apache.catalina.tribes.group.GroupChannel.messageReceived Error receiving message:
    java.lang.NullPointerException
        at org.apache.catalina.ha.deploy.FileMessageFactory.writeMessage(FileMessageFactory.java:247)
        at org.apache.catalina.ha.deploy.FarmWarDeployer.messageReceived(FarmWarDeployer.java:226)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:821)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:803)
        at org.apache.catalina.tribes.group.GroupChannel.messageReceived(GroupChannel.java:345)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
        at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.messageReceived(TcpFailureDetector.java:118)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
        at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.messageReceived(ThroughputInterceptor.java:94)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
        at org.apache.catalina.tribes.group.ChannelCoordinator.messageReceived(ChannelCoordinator.java:288)
        at org.apache.catalina.tribes.transport.ReceiverBase.messageDataReceived(ReceiverBase.java:272)
        at org.apache.catalina.tribes.transport.nio.NioReplicationTask.drainChannel(NioReplicationTask.java:229)
        at org.apache.catalina.tribes.transport.nio.NioReplicationTask.run(NioReplicationTask.java:103)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)

and here is the cluster section of master node server.xml

    className="org.apache.catalina.tribes.group.GroupChannel">
    className="org.apache.catalina.tribes.membership.McastService" address="xxx.xxx.xxx.xxx" port="" frequency="500" dropTime="5000" localLoopbackDisabled="false"/>
    className="org.apache.catalina.tribes.transport.nio.NioReceiver" address="auto" port="5221" selectorTimeout="100" maxThreads="20" timeout="5000" autoBind="1000"/>
    className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
    className="org.apache.catalina.tribes.transport.nio.PooledParallelSender" timeout="5000"/>
    className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector" connectTimeout="5000"/>
    className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
    className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
    className="org.apache.catalina.ha.deploy.FarmWarDeployer"
Re: Farm deploy random failures
Exact Tomcat version?

Is this on physical machines or on VMs?

Are there associated warning messages in the logs before the failure message about retries?

I've looked through the relevant cluster code and I don't see anything obvious that could cause this in terms of a Tomcat bug.

Increasing maxRetryAttempts and/or timeout may help.

Mark

On 02/12/2022 14:11, Dave B wrote:
> I'm having intermittent failures when I deploy to a cluster. I see the war file sent to slave nodes but it then becomes zero size. It happens on different nodes and not all the time.
>
> Upon failure, Master node .out shows
>
>     SEVERE [Catalina-utility-1] org.apache.catalina.ha.tcp.SimpleTcpCluster.send Unable to send message through cluster sender.
>     org.apache.catalina.tribes.ChannelException: Send failed, attempt:[1] max:[1]; Faulty members:tcp://{172, xx, xx, xx}:5222;
>         at org.apache.catalina.tribes.transport.nio.ParallelNioSender.doLoop(ParallelNioSender.java:217)
>         at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:78)
>         at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:51)
>         at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:65)
>         at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:83)
>         at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:89)
>         at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.sendMessage(ThroughputInterceptor.java:62)
>         at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:89)
>         at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:93)
>
> Slave node .out shows
>
>     WARNING [Tribes-Task-Receiver[localhost-Channel]-7] org.apache.catalina.tribes.group.GroupChannel.messageReceived Error receiving message:
>     java.lang.NullPointerException
>         at org.apache.catalina.ha.deploy.FileMessageFactory.writeMessage(FileMessageFactory.java:247)
>         at org.apache.catalina.ha.deploy.FarmWarDeployer.messageReceived(FarmWarDeployer.java:226)
>         at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:821)
>         at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:803)
>         at org.apache.catalina.tribes.group.GroupChannel.messageReceived(GroupChannel.java:345)
>         at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
>         at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.messageReceived(TcpFailureDetector.java:118)
>         at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
>         at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
>         at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.messageReceived(ThroughputInterceptor.java:94)
>         at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
>         at org.apache.catalina.tribes.group.ChannelCoordinator.messageReceived(ChannelCoordinator.java:288)
>         at org.apache.catalina.tribes.transport.ReceiverBase.messageDataReceived(ReceiverBase.java:272)
>         at org.apache.catalina.tribes.transport.nio.NioReplicationTask.drainChannel(NioReplicationTask.java:229)
>         at org.apache.catalina.tribes.transport.nio.NioReplicationTask.run(NioReplicationTask.java:103)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:750)
>
> and here is the cluster section of master node server.xml
>
>     className="org.apache.catalina.ha.session.BackupManager" expireSessionsOnShutdown="false" notifyListenersOnReplication="true" sessionAttributeValueClassNameFilter=".+" mapSendOptions="6"/>
>     className="org.apache.catalina.tribes.group.GroupChannel">
>     className="org.apache.catalina.tribes.membership.McastService" address="xxx.xxx.xxx.xxx" port="" frequency="500" dropTime="5000" localLoopbackDisabled="false"/>
>     className="org.apache.catalina.tribes.transport.nio.NioReceiver" address="auto" port="5221" selectorTimeout="100" maxThreads="20" timeout="5000
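
For reference, the suggestion in the reply to increase maxRetryAttempts and/or timeout might look like the sketch below. It assumes both attributes are set on the Transport element nested inside the Sender, as described in the Tomcat cluster sender documentation. The values 3 and 10000 are illustrative assumptions, not settings recommended anywhere in this thread, and the rest of the Channel configuration is omitted.

```xml
<!-- Sketch only: the values below are illustrative assumptions, not tested settings. -->
<Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
  <!-- The documented default for maxRetryAttempts is 1, which is consistent
       with "attempt:[1] max:[1]" in the master node log above. -->
  <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
             timeout="10000"
             maxRetryAttempts="3"/>
</Sender>
```

A change like this would presumably have to be made in server.xml on each node that sends deployment messages, followed by a restart, before checking whether the zero-size WAR files still occur.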