On 12/10/2021 12:38 PM, Scott wrote:
> Having a bit of a weird issue.
> We run a 4-node SolrCloud, version 8.6.2, and for the most part it's
> been going quite well for more than 2 years now. We have to restart
> the nodes occasionally to free up RAM, but I guess that's normal.
If you have to restart because Solr is using too much memory, then
something is not configured right. If the Java heap is sized
appropriately, and the machine is not handling software other than
Solr, it is pretty much impossible for a Java program like Solr to take
too much memory.
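To see where the memory is actually going, a quick check like the
following can help. This is a sketch, not from the original message:
the grep pattern assumes a standard bin/solr install, so adjust it for
your setup.

```shell
# Show the heap flags the running Solr JVM was started with
# (assumes a standard bin/solr install; adjust the pattern as needed).
ps aux | grep '[s]olr' | grep -oE '\-Xm[sx][0-9]+[gmkGMK]'

# Check overall memory and swap on the host; on a healthy Solr node,
# swap usage should stay near zero.
free -h
```

If the -Xmx value printed there is close to the machine's total RAM,
that is the first thing to fix.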
> Last night one of the nodes went into swap, used up all memory, and
> crashed. Somehow the way it crashed, it also removed all local
> cores/data. The cluster kept on chugging along, which was fine, but
> now I can't get the crashed node to resync with the others.
Assuming again that Solr is the only significant memory-using process
on the system, and the heap is sized appropriately, that system should
NEVER use significant amounts of swap.
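As a belt-and-suspenders measure, assuming a Linux host, you can also
tell the kernel to strongly avoid swapping. This is a sketch; the value
of 1 is illustrative, not something from this thread.

```shell
# Sketch, assuming Linux: discourage the kernel from swapping out a
# JVM's pages. A low vm.swappiness is commonly used for JVM services;
# 1 is an illustrative value.
sysctl -w vm.swappiness=1

# Persist the setting across reboots:
echo 'vm.swappiness = 1' >> /etc/sysctl.conf
```

This only masks the symptom, though; the real fix is sizing the heap
so the machine never needs swap in the first place.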
I'm betting that you have configured Solr with a max heap size that's
too large for the system it's running on. Because Java uses a garbage
collection memory model, almost any Java program will eventually use the
entire max heap size it has been given, even if it does not actually
need that much memory. This is expected.
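For reference, the max heap for Solr is normally set in solr.in.sh.
The 8g below is purely an illustrative value; the right size depends on
your index and query load, and should leave plenty of RAM free for the
OS disk cache.

```shell
# In solr.in.sh (often /etc/default/solr.in.sh on a service install):
# set a fixed, deliberately modest heap. 8g is illustrative only.
SOLR_HEAP="8g"

# Equivalent explicit form, setting min and max to the same value:
# SOLR_JAVA_MEM="-Xms8g -Xmx8g"
```

Setting -Xms equal to -Xmx avoids heap resizing, and keeping the total
well under physical RAM is what keeps the node out of swap.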
The most likely reason for a SolrCloud node to delete all its cores is
that it connects to a ZooKeeper ensemble that does not contain the
SolrCloud cluster config data, or contains config for a different
cluster. See this issue:
https://issues.apache.org/jira/browse/SOLR-13396
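One way to guard against a node attaching to the wrong cluster config
is to give each SolrCloud cluster its own ZooKeeper chroot. A sketch,
with illustrative hostnames and an illustrative /solr-prod chroot:

```shell
# Create the chroot once (any Solr node can do this):
bin/solr zk mkroot /solr-prod -z zk1:2181,zk2:2181,zk3:2181

# Then point every node at the chrooted path, e.g. in solr.in.sh:
ZK_HOST="zk1:2181,zk2:2181,zk3:2181/solr-prod"
```

With a chroot in place, a node that is misconfigured to a bare
ensemble (or the wrong chroot) fails to find its cluster state rather
than silently adopting someone else's.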
Thanks,
Shawn