On 1/18/23 02:13, Rohit Walecha wrote:
We have a 3 node *solr(8.8.0)* cluster deployed on multiple environments
which is connected to a 3 node *zookeeper(3.6.2)* cluster. We have
been facing frequent restarts of solr cloud nodes for the last few
months. We tried to debug this, and while looking into the logs and other
stats we have been seeing that the node which has restarted says:
Out of the box, Solr does NOT have any built-in functionality that will
restart it if it goes down. If your nodes come back up on their own,
that must have been added in your environment.
If a system is sized appropriately and doesn't have any issues with the
hardware or the system software (including Java), then Solr should never
crash.
Most of the time when Solr actually does go down, it is for one of two
reasons:
1) The operating system's "out of memory killer" (OOM killer) was
triggered by system memory pressure. It finds the largest
memory-consuming program and kills it. Fixing that often requires
adding memory to the server.
2) While running Solr, Java encountered an "OutOfMemoryError" exception.
For 8.8.0, if you're running on a non-Windows platform, Solr includes
functionality that makes it kill itself on OOME. Starting in 9.2.0,
which is not yet released, that functionality comes to Solr running on
Windows too. Note that there are several different resource depletion
conditions that result in OOME, and not all of them are actually related
to memory. Very often when OOME is thrown, Solr will not actually log
the exception. In 9.2.0 the reason for the OOME will always be logged.
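To tell those two cases apart, here is a rough Python sketch (my own
illustration, not something that ships with Solr). It rests on two
assumptions you should verify on your systems: that the kernel log
contains a "Killed process <pid> (<name>)" line when the OS OOM killer
fires (the exact wording varies by kernel version), and that the 8.x
OOM script leaves a file named like solr_oom_killer-<port>-<date>.log
in the Solr logs directory. The function names are hypothetical.

```python
import re
from pathlib import Path

# Assumption: kernel OOM-killer messages contain "Killed process <pid> (<name>)".
# Wording differs between kernel versions, so adjust the pattern if needed.
KERNEL_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def kernel_oom_kills(kernel_log_text):
    """Return (pid, process_name) for each OOM-killer victim in the text."""
    return [(int(pid), name) for pid, name in KERNEL_KILL.findall(kernel_log_text)]


def solr_oom_logs(logs_dir):
    """Return files in logs_dir that look like Solr OOM-killer logs.

    Assumption: the file name pattern is solr_oom_killer-*.log.
    """
    return sorted(Path(logs_dir).glob("solr_oom_killer-*.log"))
```

Feed kernel_oom_kills() the saved output of something like "dmesg -T"
or "journalctl -k" and look for a java process among the victims, which
points at reason 1. If solr_oom_logs() finds files with timestamps near
a restart, that points at reason 2.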
I don't really know what the ZooKeeper logs are saying, but Solr should
never die if everything is sized appropriately. Frequent restarts are
probably an indication of a problem that needs correcting.
The changes coming in version 9.2.0 can be applied to your 8.8.0 version
with the info in the patch on the following issue:
https://issues.apache.org/jira/browse/SOLR-8803
You would only need the changes to bin/solr or bin/solr.cmd. The code
changes are not necessary for the new functionality. They are there so
info about the error log is included in solr.log.
Thanks,
Shawn