Exact Tomcat version?
Is this on physical machines or on VMs?
Are there associated warning messages in the logs before the failure
message about retries?
I've looked through the relevant cluster code and I don't see anything
obvious that could cause this in terms of a Tomcat bug. Increasing
maxRetryAttempts and/or timeout may help.
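As a rough sketch of that suggestion, the Transport element from the
configuration quoted below could be adjusted along these lines. The values
3 and 10000 are assumptions picked for illustration, not tested
recommendations. The "attempt:[1] max:[1]" in the error appears to match
the default maxRetryAttempts of 1.

```xml
<Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
  <!-- Assumed values for illustration only: retry a failed send up to 3 times
       and raise the socket timeout from 5000 ms to 10000 ms. -->
  <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
             maxRetryAttempts="3"
             timeout="10000"/>
</Sender>
```

If the failures become less frequent with larger values, that would point
towards slow or briefly unreachable nodes rather than a Tomcat bug.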
Mark
On 02/12/2022 14:11, Dave B wrote:
I'm having intermittent failures when I deploy to a cluster. I see the
war file sent to slave nodes but it then becomes zero size. It happens
on different nodes and not all the time.
Upon failure, Master node .out shows
SEVERE [Catalina-utility-1] org.apache.catalina.ha.tcp.SimpleTcpCluster.send Unable to send message through cluster sender.
  org.apache.catalina.tribes.ChannelException: Send failed, attempt:[1] max:[1]; Faulty members:tcp://{172, xx, xx, xx}:5222;
    at org.apache.catalina.tribes.transport.nio.ParallelNioSender.doLoop(ParallelNioSender.java:217)
    at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:78)
    at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:51)
    at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:65)
    at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:83)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:89)
    at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.sendMessage(ThroughputInterceptor.java:62)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:89)
    at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:93)
Slave node .out shows
WARNING [Tribes-Task-Receiver[localhost-Channel]-7] org.apache.catalina.tribes.group.GroupChannel.messageReceived Error receiving message:
  java.lang.NullPointerException
    at org.apache.catalina.ha.deploy.FileMessageFactory.writeMessage(FileMessageFactory.java:247)
    at org.apache.catalina.ha.deploy.FarmWarDeployer.messageReceived(FarmWarDeployer.java:226)
    at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:821)
    at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:803)
    at org.apache.catalina.tribes.group.GroupChannel.messageReceived(GroupChannel.java:345)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
    at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.messageReceived(TcpFailureDetector.java:118)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
    at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.messageReceived(ThroughputInterceptor.java:94)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
    at org.apache.catalina.tribes.group.ChannelCoordinator.messageReceived(ChannelCoordinator.java:288)
    at org.apache.catalina.tribes.transport.ReceiverBase.messageDataReceived(ReceiverBase.java:272)
    at org.apache.catalina.tribes.transport.nio.NioReplicationTask.drainChannel(NioReplicationTask.java:229)
    at org.apache.catalina.tribes.transport.nio.NioReplicationTask.run(NioReplicationTask.java:103)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
and here is the cluster section of master node server.xml
<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
         channelSendOptions="6">
  <Manager
      className="org.apache.catalina.ha.session.BackupManager"
      expireSessionsOnShutdown="false"
      notifyListenersOnReplication="true"
      sessionAttributeValueClassNameFilter=".+"
      mapSendOptions="6"/>
  <Channel
      className="org.apache.catalina.tribes.group.GroupChannel">
    <Membership
        className="org.apache.catalina.tribes.membership.McastService"
        address="xxx.xxx.xxx.xxx"
        port="xxxx"
        frequency="500"
        dropTime="5000"
        localLoopbackDisabled="false"/>
    <Receiver
        className="org.apache.catalina.tribes.transport.nio.NioReceiver"
        address="auto"
        port="5221"
        selectorTimeout="100"
        maxThreads="20"
        timeout="5000"
        autoBind="1000"/>
    <Sender
        className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
      <Transport
          className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
          timeout="5000"/>
    </Sender>
    <Interceptor
        className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"
        connectTimeout="5000"/>
    <Interceptor
        className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
    <Interceptor
        className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
  </Channel>
  <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
         filter=".*\.gif|.*\.js|.*\.jpeg|.*\.jpg|.*\.png|.*\.htm|.*\.html|.*\.css|.*\.txt"/>
  <Deployer
      className="org.apache.catalina.ha.deploy.FarmWarDeployer"
      tempDir="/apps/tomcat/env22_node1/temp/"
      deployDir="/apps/tomcat/env22_node1/webapps/"
      watchDir="/apps/deployments/tomcat/env22/"
      watchEnabled="true"/>
  <ClusterListener
      className="org.apache.catalina.ha.session.ClusterSessionListener"/>
</Cluster>
Is this enough info for someone to suggest a fix?
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org