Thanks for the help, Shawn and Walter.  After increasing the open files limit 
to 128000 and the JVM heap to 16 GB, I was able to load my documents.
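
In case it helps anyone following the thread, here is roughly where I made 
those changes.  The paths are from a standard Linux service install, so yours 
may differ:

    # /etc/security/limits.conf: raise the open file limit for the solr user
    solr  soft  nofile  128000
    solr  hard  nofile  128000

    # /etc/default/solr.in.sh: raise the JVM heap
    SOLR_HEAP="16g"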

I now have a collection with 2.3 T rows / ~480 GB running on a 4-node cluster.  
I have found that complicated queries (for example, searching a field for two 
terms joined with "AND") often time out.  If I retry, the query does 
eventually complete, so I'm assuming this is a caching / warm-up issue.

Is there a configuration option I can use to cache the index for one of the 
fields, or to increase the timeout?  Any other advice on getting this to 
perform faster is appreciated.
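
To make the question concrete: is the firstSearcher warming listener in 
solrconfig.xml the kind of thing I should be looking at?  A made-up sketch 
(the field and terms are hypothetical):

    <listener event="firstSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">body:solr AND body:search</str>
          <str name="rows">10</str>
        </lst>
      </arr>
    </listener>

Or is timeAllowed the right knob for the timeout side of the question?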

Thanks again,
Jon

-----Original Message-----
From: Shawn Heisey <apa...@elyograg.org> 
Sent: Thursday, July 1, 2021 6:48 PM
To: users@solr.apache.org
Subject: Re: Solr nodes crashing

On 7/1/2021 4:23 PM, Jon Morisi wrote:
> I've had an indexing job running for 24+ hours.  I'm importing 100m+ 
> documents.  After about 8 hours both of the replica nodes crashed but the 
> primary nodes have continued to run and index.

There's a common misconception here: Java programs, including Solr, almost 
never truly crash.

If you've started a recent Solr version on a platform other than Windows, then 
Solr is started with a Java option that runs a script whenever the program 
throws an OutOfMemoryError exception.  What that script does is simple: it 
logs a line to a logfile and then kills Solr with the -9 (SIGKILL) signal.  
Note that there are a number of resource depletion scenarios, other than 
memory, which can result in an OutOfMemoryError.  That's why you were asked 
about open file and process limits.
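
If you want to see the mechanism, look at the Java command line of a running 
Solr process.  The start script passes something like this (paths and port 
vary by install):

    -XX:OnOutOfMemoryError="/opt/solr/bin/oom_solr.sh 8983 /var/solr/logs"

The oom_solr.sh script is what writes the log line and issues the kill -9.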

Most operating systems also have what is known as the "oom killer". 
When system memory becomes extremely tight, the OS will find the processes 
using the most memory and kill one of them.
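
When the oom killer fires, it leaves evidence in the kernel log.  Something 
like this will usually show it:

    dmesg -T | grep -i 'out of memory\|killed process'

If Solr's PID shows up there, it was the OS, not Solr's own script.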

These two things will LOOK like a crash, but they're not really crashes.

> JVM-Memory 50.7% (497 MB used of a 981.38 MB max heap)

This indicates that your max heap setting for Solr is in the ballpark of 1GB.  
That is extremely small, so you're probably throwing OutOfMemoryError because 
of heap space, which on a non-Windows system will basically cause Solr to 
commit suicide.  It does this because once OOME is thrown, program operation 
becomes completely unpredictable, and index corruption is a very real 
possibility.

There are precisely two ways to deal with OOME.  One is to increase the size 
of the resource that is being depleted.  The other is to change the program or 
its configuration so that it doesn't require as much of that resource.  Often, 
especially with Solr, the second option is simply not possible.

Most likely you're going to need to increase Solr's heap far beyond 1GB.  
There's no way for us to come up with a recommendation without asking you a 
lot of very detailed questions about your setup ... and even with that, it's 
possible we would give you an incorrect recommendation.  I'll give you a 
number, with the warning that it could be way too small or way too large: try 
an 8GB heap.  You have lots of memory in this system; 8GB is barely a drop in 
the bucket.
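
Assuming you're using the service install, set it in the include script and 
restart Solr:

    # /etc/default/solr.in.sh
    SOLR_HEAP="8g"

For a manual start, "bin/solr start -m 8g" does the same thing.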

Thanks,
Shawn
