We set the G1HeapRegionSize to 32m and set the heap to 31g. Those changes did not make a difference; if anything, they made the problem worse. Before the change we would see 2 nodes failing; after the change we see all 4 nodes failing.
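For reference, the settings described above would look roughly like this in solr.in.sh. This is a hedged sketch: SOLR_HEAP and GC_TUNE are the standard solr.in.sh variables, but the exact flag combination in use here is not shown in the thread.

```shell
# Hedged sketch of the settings described above, as they would appear
# in solr.in.sh. SOLR_HEAP and GC_TUNE are standard Solr variables;
# the values are the ones mentioned in this thread.
SOLR_HEAP="31g"
GC_TUNE="-XX:+UseG1GC -XX:G1HeapRegionSize=32m"
echo "heap=${SOLR_HEAP} gc=${GC_TUNE}"
```

Staying at 31g rather than 32g also keeps the JVM below the compressed-pointer cutoff Shawn describes later in the thread.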
We collect all the Solr search data into an Elastic Stack index, and I noticed that /select calls are not spread evenly across all 4 nodes. One node is getting more than 50% of the calls while another is getting very little. Our Solr search traffic comes in via a REST service that calls Solr using CloudSolrClient, so I would expect a more even distribution of searches. Could this be contributing to our memory issues? What would cause it?

-----Original Message-----
From: Shawn Heisey <elyog...@elyograg.org>
Sent: Thursday, June 24, 2021 2:54 PM
To: users@solr.apache.org
Subject: Re: Solr GC tuning Advice

On 6/24/2021 8:55 AM, Webster Homer wrote:
> SOLR_JAVA_MEM="-Xms32768m -Xmx32768m"

Change "32768m" to "31g". You'll actually have MORE heap available at 31GB than at 32GB. Screwy, I know. It's because at 32GB, Java has to use 64-bit pointers; below that, it can use 32-bit pointers.

> On several posts on Solr GC configuration I noticed that
> -XX:+UseLargePages is set, but little to say why this would be useful.
> For it to work, Large Memory Pages needs to be enabled as described here:
> https://www.oracle.com/java/technologies/javase/largememory-pages.html

If the OS does not have huge pages set aside, that Java option does nothing at all. If the OS does have huge pages available, then Java will use them if you give it that option.

> I'm not an expert in GC tuning, so any advice would be appreciated.

The key to good GC performance: make it unnecessary for Java to EVER do a full GC. No matter which collector you have chosen, if Java does a full GC, you're paused, potentially for a REALLY long time. With a heap as large as yours, you can be sure it WILL be slow. I was seeing pauses of 10-15 seconds, and that was with an 8GB heap.
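Shawn's huge-pages caveat is easy to verify on a Linux host. A minimal, Linux-specific check (assuming /proc/meminfo is available, as it is on any standard Linux kernel):

```shell
# If HugePages_Total is 0, the OS has no huge pages reserved and
# -XX:+UseLargePages will silently do nothing (Linux-specific check).
grep '^HugePages' /proc/meminfo
```

A nonzero HugePages_Total (with the page count sized to cover the heap) is the prerequisite for the JVM flag to have any effect.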
Are there ANY humongous allocations still happening? If there are, then the only way Java can collect that garbage is a full GC. Maybe bump the region size up to its max, 32MB? How many documents do you have in your largest cores?

Thanks,
Shawn
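A rough illustration of the threshold behind Shawn's question: in G1, an allocation of at least half a region is treated as humongous, so even at the maximum 32 MB region size, anything of 16 MB or more still qualifies. A sketch of that arithmetic:

```shell
# In G1, an object at least half a region in size is "humongous".
# At the maximum region size of 32 MB, the threshold is therefore 16 MB.
REGION_MB=32
THRESHOLD_MB=$((REGION_MB / 2))
echo "humongous threshold: ${THRESHOLD_MB} MB"
```

This is why raising G1HeapRegionSize can help: allocations that were humongous under a smaller region size become ordinary allocations, which the young and mixed collections can reclaim without a full GC.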