On 11/4/21 6:27 AM, Michael Conrad wrote:
> When there is a high segment count, processes on the systems start showing very high "i/o wait" with associated idle CPU time according to top, which indicates to me that CPU core count isn't a main culprit.
>
> SOLR_JAVA_MEM="-Xms1g -Xmx5g"
> GC_TUNE=""

Don't set GC_TUNE this way.  I just tried it, and an empty GC_TUNE completely disables all GC tuning, at least on Solr 8.10.1.  It does NOT fall back to Solr's defaults; it falls back to Java's defaults, and Java's defaults have always been terrible for Solr.  If you want Solr's standard GC tuning, which is actually pretty good, remove GC_TUNE from your solr.in.sh file.

For the Java version you're running, I would recommend either Shenandoah or ZGC, although G1GC is also quite good if it is tuned further than just turning it on.  I do not remember which version of Solr changed its defaults from CMS to G1.  I am not a GC expert, but I have done some experimenting with different GC options.
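
To illustrate removing GC_TUNE, the relevant part of solr.in.sh might look roughly like this.  It is only a sketch: the heap values are the ones from your message, and the comments are mine.

```shell
# solr.in.sh (fragment)
# Heap settings copied from the original message.
SOLR_JAVA_MEM="-Xms1g -Xmx5g"

# GC_TUNE is left commented out so that it stays undefined and
# bin/solr applies its own GC tuning instead of Java's defaults.
#GC_TUNE=""
```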

I'm betting that the performance problems aren't due to a high segment count.  I think it's more likely that they are due to memory issues.  I don't have enough information yet to determine which of the two memory-related problems I mentioned you're running into.

Can you share solr_gc.log generated during a time when the iowait gets bad?  Be aware that restarting Solr will rotate that log and it will have a number at the end of the filename after that.  You could share all of the GC logs and indicate which one has the right data in it.  It will PROBABLY be the largest file.
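
If it helps, something like the following lists the GC logs largest first.  This is a rough sketch: list_gc_logs is a made-up helper, and the /var/solr/logs fallback is an assumption about where your logs live, so adjust the directory for your install.

```shell
#!/bin/sh
# Hypothetical helper: list GC log files in the given directory,
# sorted by size with the largest first (ls -S).
list_gc_logs() {
  ls -S "$1"/solr_gc.log* 2>/dev/null
}

# Uses SOLR_LOGS_DIR if it is set; the fallback path is an assumption.
list_gc_logs "${SOLR_LOGS_DIR:-/var/solr/logs}" || echo "no GC logs found"
```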

There is also a screenshot that answers a whole bunch of questions about memory use on the server all at once.  How to gather the screenshot is discussed here:

https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems#SolrPerformanceProblems-Askingforhelponamemory/performanceissue

For devs:

I would have expected an empty GC_TUNE to go with Solr defaults. I see this line in bin/solr (on 8.10.1):

  if [ -z ${GC_TUNE+x} ]; then

I think the +x part should not be there.  With it removed, the script interprets an empty string as undefined and uses the defaults, which I think is correct.  The +x appears in four places in the script.
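
A minimal sketch of the difference, assuming a POSIX shell.  check_gc_tune is a made-up helper for illustration and is not in bin/solr.  The second test is quoted and written as "${GC_TUNE:-}", which is my assumption about how a fix could be written, not the actual patch; an unquoted ${GC_TUNE} would fail with "too many arguments" whenever the value holds more than one option.

```shell
#!/bin/sh
# Hypothetical helper: reports which branch each style of test would take.
# "defaults" means the test treats GC_TUNE as absent; "custom" means it does not.
check_gc_tune() {
  # ${GC_TUNE+x} expands to "x" whenever GC_TUNE is set, even to "".
  if [ -z "${GC_TUNE+x}" ]; then plus_x="defaults"; else plus_x="custom"; fi
  # ${GC_TUNE:-} expands to "" when GC_TUNE is unset OR empty.
  if [ -z "${GC_TUNE:-}" ]; then plain="defaults"; else plain="custom"; fi
  echo "with +x: $plus_x, without +x: $plain"
}

unset GC_TUNE
check_gc_tune    # with +x: defaults, without +x: defaults

GC_TUNE=""
check_gc_tune    # with +x: custom, without +x: defaults

GC_TUNE="-XX:+UseG1GC"
check_gc_tune    # with +x: custom, without +x: custom
```

The middle case is the one in question: with the +x, an empty GC_TUNE counts as a custom setting, so no GC options get applied at all.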

Thanks,
Shawn

