Thanks for the help, Shawn and Walter. After increasing the open-files limit to 128000 and the JVM memory to 16 GB, I was able to load my documents.
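[Editor's note: for reference, a sketch of where those two settings typically live on a Linux service install of Solr. The paths and the `solr` username are assumptions; adjust for your environment.]

```shell
# /etc/security/limits.conf -- raise the open-files limit for the user
# running Solr (the username "solr" is an assumption):
#   solr  soft  nofile  128000
#   solr  hard  nofile  128000

# /etc/default/solr.in.sh -- raise the JVM heap to 16 GB:
SOLR_HEAP="16g"
```

[Solr must be restarted for either change to take effect.]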
I now have a collection with 2.3 T rows / ~480 GB running on a 4-node cluster. I have found that complicated queries (searching for two search terms in a field with "AND", for example) often time out. If I try multiple times, the query does eventually complete. I'm assuming this is a caching / warm-up issue. Is there a configuration option I can use to cache the indexes for one of the columns, or to increase the timeout? Any other advice to get this performing faster is appreciated.

Thanks again,
Jon

-----Original Message-----
From: Shawn Heisey <apa...@elyograg.org>
Sent: Thursday, July 1, 2021 6:48 PM
To: users@solr.apache.org
Subject: Re: Solr nodes crashing

On 7/1/2021 4:23 PM, Jon Morisi wrote:
> I've had an indexing job running for 24+ hours. I'm importing 100m+
> documents. After about 8 hours both of the replica nodes crashed, but
> the primary nodes have continued to run and index.

There's a common misconception here. Java programs, including Solr, almost never crash. If you've started a recent Solr version on a platform other than Windows, then Solr is started with a Java option that runs a script whenever an OutOfMemoryError exception is thrown by the program. What that script does is simple -- it logs a line to a logfile and then kills Solr with the -9 (kill) signal.

Note that there are a number of resource-depletion scenarios, other than memory, that can result in an OutOfMemoryError. That's why you were asked about open file and process limits.

Most operating systems also have what has been named the "oom killer". When system memory becomes extremely tight, the OS will find programs using a lot of memory and kill one of them. These two things will LOOK like a crash, but they're not really crashes.

> JVM-Memory 50.7%
> 981.38 MB
> 981.38 MB
> 497

This indicates that your max heap setting for Solr is in the ballpark of 1 GB. This is extremely small, so you're probably throwing OutOfMemoryError because of heap space.
Which, on a non-Windows system, will basically cause Solr to commit suicide. It does this because when OOME is thrown, program operation becomes completely unpredictable, and index corruption is a very real possibility.

There are precisely two ways to deal with OOME. One is to increase the size of the resource that is being depleted. The other is to change the program or the program configuration so that it doesn't require as much of that resource. Often, especially with Solr, the second option is simply not possible.

Most likely you're going to need to increase Solr's heap far beyond 1 GB. There's no way for us to come up with a recommendation for you without asking you a lot of very detailed questions about your setup ... and even with that, it's possible that we would give you an incorrect recommendation.

I'll give you a number, and warn you that it could be wrong, either way too small or way too large. Try an 8GB heap. You have lots of memory in this system; 8GB is barely a drop in the bucket.

Thanks,
Shawn
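[Editor's note: on the query-timeout question earlier in the thread, one relevant knob is Solr's `timeAllowed` common query parameter, which caps search time in milliseconds and returns partial results rather than failing outright when the cap is hit. A rough sketch, where the collection name `mycoll`, the field name `body`, and the host/port are assumptions for illustration only.]

```shell
# timeAllowed is in milliseconds; collection, field, and terms here are
# hypothetical. A larger value gives a cold (not-yet-cached) query more
# time to finish instead of timing out.
curl "http://localhost:8983/solr/mycoll/select" \
  --data-urlencode 'q=body:(term1 AND term2)' \
  --data-urlencode 'timeAllowed=60000'
```

[Separately, Solr's `firstSearcher`/`newSearcher` warming listeners in solrconfig.xml can pre-run representative queries so the first search after a restart or commit is not served from a completely cold searcher.]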