On 28/04/2020 06:42, Mahathi Vavilala wrote:
> Hi,
>
> For default values of frequency (=500) and dropTime (=3000) in cluster
> installation with TOMEE 7.0.7 (uses Tomcat 8.5.50.0) using JDK 12 or above
> only, many memberDisappeared messages are logged on all nodes of the
> cluster (almost for every 2 minutes). This is occurring even when the system
> is idle. In the JDK 12 environment, on increasing the dropTime to 10000, we
> identified that they are scarcely generated. The exact same setup works
> without any problem when configured with JDK 11.
>
> We also reproduced this issue on Tomcat-only (v8.5.54 and v8.5.50) cluster
> setup using Oracle JDK 12.0.2 (64-bit, java version "12.0.2" 2019-07-16)
> and JDK14 (64-bit, java version "14" 2020-03-17).
It shouldn't make any difference but try testing with the latest Adopt
OpenJDK releases for 11, 12 and 14.

> *Cluster setup details:*
> 1) *Nodes Configurations:*
> Count : 2
> Resources : 1-cpu 2.29 GHZ, 1-core and 4 GB RAM.

Physical machines? VMs on local hardware? VMs in the cloud?

What networking is used between the nodes? Do any other machines have
access to the network? Might anything else be generating network traffic?

> 2) *Server.xml Configuration:*
> a) We used the default cluster settings as per Tomcat 8.5 docs
> (https://tomcat.apache.org/tomcat-8.5-doc/cluster-howto.html) using
> asynchronous replication mode.
> b) Also for non-default channelSendOptions:
> <Cluster
> className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
> channelSendOptions="6">
> In both instances, we reproduced the issue.
>
> Please note that we were able to *consistently reproduce the issue with 1-cpu
> nodes with default dropTime value*. However, we are unable to reproduce it
> with increased cpu resources (cpu count >= 2) or higher dropTime (>= 8500
> ms).
>
> We would like to ascertain, if this is an environment issue and is caused
> by 1-cpu configured nodes?
>
> P.S - This was filed in bugzilla,
> https://bz.apache.org/bugzilla/show_bug.cgi?id=64312

where I was unable to reproduce it with a 4 node, 1-cpu cluster.

Some other things to try:

Use Wireshark or similar to examine the network traffic around the times
of the failures. Are messages not being generated in a timely manner? Are
they generated but appear not to be read in a timely manner?

Use a profiler (I use YourKit because they give committers of OSS
projects a free license to use with that project) to see what each node
is doing. Is it heavily loaded? Is there excessive GC? What are the
relevant cluster threads doing?

For all of the above, compare a working system and one with the issue.
What is different?
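If a profiler is not to hand, a rough first check of the "heavily loaded / excessive GC" questions could look like the sketch below. It is a hypothetical standalone probe (the class and method names are made up for illustration and are not part of Tomcat or Tribes). It sleeps for the heartbeat interval, records how late each wake-up actually is, and reports the GC time accumulated during the run. It only measures scheduling delay inside its own JVM, so treat it as a hint about the node, not as evidence about the cluster threads themselves.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical diagnostic sketch; not part of Tomcat or Tribes. */
public class HeartbeatJitterProbe {

    /** Largest gap between consecutive timestamps (same unit as the input). */
    public static long maxGap(long[] timestamps) {
        long max = 0;
        for (int i = 1; i < timestamps.length; i++) {
            max = Math.max(max, timestamps[i] - timestamps[i - 1]);
        }
        return max;
    }

    /** Number of consecutive gaps strictly greater than the threshold. */
    public static int countGapsOver(long[] timestamps, long threshold) {
        int count = 0;
        for (int i = 1; i < timestamps.length; i++) {
            if (timestamps[i] - timestamps[i - 1] > threshold) {
                count++;
            }
        }
        return count;
    }

    /** Accumulated GC time in ms across all collectors; undefined (-1) values are skipped. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long frequency = 500;  // ms, the default heartbeat interval quoted above
        long dropTime = 3000;  // ms, the default dropTime quoted above
        // Default is a short run; pass e.g. 240 for roughly two minutes.
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 10;

        long[] stamps = new long[samples];
        long gcBefore = totalGcMillis();
        for (int i = 0; i < samples; i++) {
            stamps[i] = System.nanoTime() / 1_000_000; // monotonic clock, in ms
            Thread.sleep(frequency);
        }

        System.out.println("processors=" + Runtime.getRuntime().availableProcessors());
        System.out.println("max gap ms=" + maxGap(stamps));
        System.out.println("gaps over dropTime=" + countGapsOver(stamps, dropTime));
        System.out.println("GC ms during run=" + (totalGcMillis() - gcBefore));
    }
}
```

Running it with the same JDK and JVM options on a node that shows the problem and on one that does not gives a simple side-by-side comparison. A maximum gap well above 500 ms on the 1-cpu nodes under JDK 12+ would point at scheduling or GC pauses; gaps close to 500 ms would point back towards the network.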
Mark