That is not correctly configured. The open files and max user processes limits are too small: the documentation you reference says to set both to 65,000, but yours are 1024 and 4096, respectively.
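The usual fix on RHEL is to raise both limits for the account that runs Solr. A minimal sketch, assuming Solr runs as a "solr" user and using the 65000 value from the guide you linked (adjust the user name to match your install):

    # /etc/security/limits.conf (or a new file under /etc/security/limits.d/)
    # "solr" is assumed to be the account that runs Solr
    solr  soft  nofile  65000
    solr  hard  nofile  65000
    solr  soft  nproc   65000
    solr  hard  nproc   65000

If Solr is started by systemd, those PAM limits may not apply; in that case set LimitNOFILE=65000 and LimitNPROC=65000 in the service unit (systemctl edit solr) and restart. Either way, verify with "ulimit -a" in a shell for the solr user, or check the running process directly with cat /proc/<solr pid>/limits.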
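On logging /proc/sys/fs/file-nr: that counter is system-wide and is checked against file-max, which at 39 million is not your problem. The 1024 open files limit is per process, so the more useful number to watch is the fd count of the Solr process itself. A rough cron sketch, assuming a single Solr instance that shows up as Jetty's start.jar (run it from root's crontab so /proc/<pid>/fd is readable, and adjust the pgrep pattern to your install):

    # Every 5 minutes, append the system-wide and per-process fd counts to a log
    */5 * * * * echo "$(date -Is) file-nr=$(cat /proc/sys/fs/file-nr) solr-fds=$(ls /proc/$(pgrep -f start.jar | head -1)/fd 2>/dev/null | wc -l)" >> /var/log/solr-fd-count.log

If solr-fds climbs toward 1024 during a big batch, you have your answer.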
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jun 25, 2021, at 11:35 AM, Jon Morisi <jon.mor...@hsc.utah.edu> wrote:
>
> Hi everyone,
> I'm running Solr 7.4.0 and have a collection running on 4 nodes (2 shards,
> replication factor = 2). I'm experiencing an issue where random nodes will
> crash when I submit large batches to be indexed (>500,000 documents). I've
> been successful in keeping things running if I keep an eye on it and
> restart nodes after they crash. Sometimes I end up with a non-recoverable
> replicated shard, which I fix by dropping the replica and re-adding it.
>
> I've also been successful, with no crashing, when I batch inserts in sizes
> < 500,000 documents, so that's my workaround for now.
>
> I'm wondering if anyone can help point me in the right direction for
> troubleshooting this issue, so that I can send upwards of 100m documents
> at a time.
>
> From the logs, I have the following errors:
> SolrException.java:148) - java.io.EOFException
> org.apache.solr.update.ErrorReportingConcurrentUpdateSolrClient
> (StreamingSolrClients.java:147) - error
>
> I did see this:
> https://solr.apache.org/guide/7_3/taking-solr-to-production.html#file-handles-and-processes-ulimit-settings
>
> I'm running RHEL, does this look correctly configured?
> ulimit -a
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 1544093
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1024
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 4096
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
>
> cat /proc/sys/fs/file-max
> 39208945
>
> I was thinking of scheduling a job to log the output of
> cat /proc/sys/fs/file-nr every 5 minutes or so on my next attempt, to
> validate that this setting is not an issue.
>
> Any other ideas?
>
> TIA,
> Jon