Thanks for the response, Walter. Upon further review it looks like my solr service account has:

open files                      (-n) 128000
max user processes              (-u) 65536

When I submit documents to be indexed, does it run under my account (the logged-in account), or does it run under the service account?
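To double-check, I'm planning to look at the limits on the running Solr process itself, since I assume those are what actually apply once an indexing request reaches the server. Roughly this (the start.jar pattern is a guess for my Jetty-based install; adjust as needed):

    # Find the Solr JVM and confirm which account owns it
    SOLR_PID=$(pgrep -f 'start.jar' | head -n 1)
    ps -o pid,user,comm -p "$SOLR_PID"

    # Effective limits of the running process, as opposed to the
    # limits of whatever shell I happen to be logged in to
    cat "/proc/$SOLR_PID/limits"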
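For reference, the service-account values above would correspond to entries along these lines in /etc/security/limits.conf (a sketch; it assumes the account is literally named "solr", and a drop-in under /etc/security/limits.d/ would work as well):

    # raise open-file and process limits for the Solr service account
    solr  soft  nofile  128000
    solr  hard  nofile  128000
    solr  soft  nproc   65536
    solr  hard  nproc   65536

One caveat I'm not sure about: if Solr is started as a systemd service, I believe the unit's LimitNOFILE/LimitNPROC settings take precedence over limits.conf, which is another reason to check /proc/<pid>/limits on the live process.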
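The file-nr logging job I mentioned in my original message below would be roughly this crontab entry (the log path is arbitrary):

    # log system-wide file-handle usage (allocated, free, max) every 5 minutes
    */5 * * * * echo "$(date -Is) $(cat /proc/sys/fs/file-nr)" >> /var/tmp/file-nr.log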
-----Original Message-----
From: Walter Underwood <[email protected]>
Sent: Friday, June 25, 2021 12:39 PM
To: [email protected]
Subject: Re: Solr nodes crashing

That is not correctly configured. The open files setting is too small. The documentation you reference says to set files and processes to 65,000. Yours are set to 1024 and 4096, respectively.

wunder
Walter Underwood
[email protected]
http://observer.wunderwood.org/  (my blog)

> On Jun 25, 2021, at 11:35 AM, Jon Morisi <[email protected]> wrote:
>
> Hi everyone,
> I'm running Solr 7.4.0 and have a collection running on 4 nodes (2 shards,
> replication factor = 2). I'm experiencing an issue where random nodes
> crash when I submit large batches to be indexed (>500,000 documents). I've
> been successful in keeping things running if I keep an eye on it and
> restart nodes after they crash. Sometimes I end up with a non-recoverable
> replicated shard, which I fix by dropping the replica and re-adding it.
>
> I've also been successful (no crashing) if I batch inserts in sizes of
> < 500,000 documents, so that's my workaround for now.
>
> I'm wondering if anyone can help point me in the right direction for
> troubleshooting this issue, so that I can send upwards of 100m documents
> at a time.
>
> From the logs, I have the following errors:
> SolrException.java:148) - java.io.EOFException
> org.apache.solr.update.ErrorReportingConcurrentUpdateSolrClient
> (StreamingSolrClients.java:147) - error
>
> I did see this:
> https://solr.apache.org/guide/7_3/taking-solr-to-production.html#file-handles-and-processes-ulimit-settings
>
> I'm running RHEL. Does this look correctly configured?
>
> ulimit -a
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 1544093
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1024
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 4096
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
>
> cat /proc/sys/fs/file-max
> 39208945
>
> I was thinking of scheduling a job to log the output of
> cat /proc/sys/fs/file-nr every 5 minutes or so on my next attempt,
> to validate that this setting is not the issue.
>
> Any other ideas?
>
> TIA,
> Jon
