Hi,

We are working on an Ignite project with Cassandra as persistent storage. During our tests we ran into the continuous Cassandra session refresh issue: https://issues.apache.org/jira/browse/IGNITE-8354
When we observed the above issue we also ran into an OutOfMemory exception. Though the above issue is solved, we went through the source code to find the root cause of the OOM, and we found one potential cause.

In org.apache.ignite.cache.store.cassandra.session.CassandraSessionImpl.java, when the refresh() method is invoked to handle exceptions, a new Cluster is built with the same LoadBalancingPolicy object. We are using RoundRobinPolicy, so the same RoundRobinPolicy object is reused every time a Cluster is built during refresh().

RoundRobinPolicy holds a CopyOnWriteArrayList<Host> liveHosts. Whenever init(Cluster cluster, Collection<Host> hosts) is called on RoundRobinPolicy, it calls liveHosts.addAll(hosts), which appends the whole Host collection to liveHosts. So each time a Cluster is built during refresh(), the same Hosts are added again to the liveHosts of the reused RoundRobinPolicy. The list grows with every refresh() and, after many refresh() calls, can grow large enough to cause an OOM. The heap dump taken after the OOM also showed a huge number of objects in liveHosts of the RoundRobinPolicy object.

IGNITE-8354 fixed the OOM by preventing unnecessary refresh() calls, but it does not fix the underlying memory leak in the reused RoundRobinPolicy. In the long run there can be many Cassandra refreshes for genuine reasons, and we would again end up with many Hosts in liveHosts of the RoundRobinPolicy object.

Some possible solutions would be:
1. Use a new LoadBalancingPolicy object when building the new Cluster during refresh().
2. Somehow clear the objects in liveHosts during refresh().

There is also a workaround: use DCAwareRoundRobinPolicy, which adds hosts per datacenter and only if absent. But we are using a single datacenter, and it is not recommended to use DCAwareRoundRobinPolicy with a single datacenter.

I would like to request that someone from the Ignite Cassandra module development team look into this issue.
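To make the growth pattern and the first proposed solution concrete, here is a minimal, self-contained sketch. It does not use the Cassandra driver or Ignite code: SimplePolicy is a hypothetical stand-in that only mimics the liveHosts.addAll(hosts) behaviour described above, hosts are plain strings, and the method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;

class LiveHostsLeakDemo {
    /** Hypothetical stand-in for a round-robin policy (not the driver class). */
    static class SimplePolicy {
        final CopyOnWriteArrayList<String> liveHosts = new CopyOnWriteArrayList<>();

        // Assumption taken from the description above: init() appends all
        // hosts without checking whether they are already present.
        void init(Collection<String> hosts) {
            liveHosts.addAll(hosts);
        }
    }

    /** Simulates refresh() rebuilding the cluster with ONE shared policy object. */
    static int sizeWithSharedPolicy(List<String> hosts, int refreshes) {
        SimplePolicy shared = new SimplePolicy();
        for (int i = 0; i < refreshes; i++) {
            shared.init(hosts); // same instance, so liveHosts keeps growing
        }
        return shared.liveHosts.size();
    }

    /** Solution 1: obtain a fresh policy from a factory on every refresh. */
    static int sizeWithFreshPolicy(List<String> hosts, int refreshes,
                                   Supplier<SimplePolicy> factory) {
        SimplePolicy current = factory.get();
        for (int i = 0; i < refreshes; i++) {
            current = factory.get(); // old instance becomes garbage
            current.init(hosts);
        }
        return current.liveHosts.size();
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("host1", "host2", "host3");
        // 3 hosts appended on each of 1000 refreshes
        System.out.println(sizeWithSharedPolicy(hosts, 1000)); // prints 3000
        // a fresh policy only ever sees one init() call
        System.out.println(sizeWithFreshPolicy(hosts, 1000, SimplePolicy::new)); // prints 3
    }
}
```

With the shared instance the list size is the host count multiplied by the number of refreshes, while with a fresh instance per refresh it stays at the host count. In the real module this would probably mean configuring a policy factory rather than a policy instance, but that is only a guess at how it could be wired in.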
-- Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/