Thanks for your reply, Walter. Yes, the same disk volume is shared by all instances.

Thanks,
Modassar

On Fri, Apr 8, 2022 at 10:54 AM Walter Underwood <wun...@wunderwood.org> wrote:

> Are you sharing the same disk volume on all instances? I would expect that
> to be slow and cause index corruption. Each instance should have its own
> disk volumes. I’m looking at this part of your config.
>
> Storage : Multi-attach EBS Volume. Provisioned IOPS SSD (io1) with 3000
> IOPS.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
> > On Apr 7, 2022, at 10:07 PM, Modassar Ather <modather1...@gmail.com> wrote:
> >
> > Hi,
> >
> > I tried a few different settings of GC and observed the following. The best
> > result I got with the following environment and GC settings but still it is
> > comparatively slower than the previous Solr-6.5.1 setup.
> >
> > Total index : 4+ TB
> > Servers : 3 instances of x2gd.4xlarge systems each having 16 CPUs and 256
> > GB RAM.
> > Storage : Multi-attach EBS Volume. Provisioned IOPS SSD (io1) with 3000
> > IOPS.
> >
> > GC settings
> > 1. "-XX:+UseG1GC -XX:InitialHeapSize=30g -XX:MaxHeapSize=30g
> > -XX:+UseStringDeduplication -XX:MaxTenuringThreshold=8
> > -XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled
> > -XX:MaxGCPauseMillis=100 -XX:InitiatingHeapOccupancyPercent=55"
> >
> > 2. "-XX:+UseG1GC -XX:InitialHeapSize=30g -XX:MaxHeapSize=30g
> > -XX:MaxTenuringThreshold=8 -XX:+PerfDisableSharedMem
> > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=100
> > -XX:InitiatingHeapOccupancyPercent=55"
> >
> > The second GC setting is better for our environment but even with these
> > settings we are facing issues of slowness even for simple term queries.
> > The memory is not highly utilised and CPU% is also not very high. The
> > slowness increases if we try to fetch more rows per request as well.
> >
> > Please provide your inputs.
> >
> > Thanks,
> > Modassar
> >
> > On Mon, Mar 28, 2022 at 2:51 AM Shawn Heisey <apa...@elyograg.org> wrote:
> >
> >> On 3/27/2022 1:20 PM, Modassar Ather wrote:
> >>> Just to add one point, even the queries without the wildcards e.g. a
> >>> boolean query or a query with 10000 ids ORed has also become slow and it is
> >>> also taking more CPU and finally ending up taking more time.
> >>> I understand this is due to many GC pauses so if we fine tune the GC
> >>> settings the CPU utilisation should go down.
> >>
> >> The GC settings that Solr 8.x comes with out of the box are already very
> >> good. Disclaimer: They are very similar to settings that I came up with
> >> after some intense testing work.
> >>
> >> If you are running a recent release of Java 11 or OpenJDK 11 and have a
> >> test environment, you could try Shenandoah. My testing shows that this
> >> collector makes a significant difference in GC pause activity, but that
> >> throughput takes a definite hit. If your indexing speed is sufficiently
> >> fast, you could give these settings a try in your solr.in.sh file:
> >>
> >> GC_TUNE=" \
> >>   -XX:+UseShenandoahGC \
> >>   -XX:+AlwaysPreTouch \
> >>   -XX:+PerfDisableSharedMem \
> >>   -XX:+ParallelRefProcEnabled \
> >>   -XX:+UseStringDeduplication \
> >>   -XX:ParallelGCThreads=2 \
> >>   -XX:+UseNUMA"
> >>
> >> Note that UseNUMA will only make a difference if your server has more
> >> than one NUMA node. But it will not harm anything if the server does
> >> not have it.
> >>
> >> I mention indexing speed and throughput because indexing speed is where
> >> I noticed a decrease with Shenandoah. Fully reindexing my dovecot
> >> install (about 150K messages) takes about 8 minutes with G1 and about 9
> >> minutes with Shenandoah. GC analysis revealed a larger number of
> >> significantly smaller GC pauses, with the total pause time a little
> >> lower.
> >> But my co-conspirator on Shenandoah testing (with a much larger
> >> index than mine) said that their re-indexing process failed to complete
> >> with Shenandoah, so if you have a test system you can try it on, I would
> >> recommend doing that before deploying to production.
> >>
> >> My Shenandoah testing was done with OpenJDK 11.0.3, and that server now
> >> has 11.0.14, which in theory should be a lot more stable.
> >>
> >> I also came up with some good CMS settings. But I think the CMS
> >> collector has been deprecated, though I do not know what version of Java
> >> might ultimately remove it.
> >>
> >> https://cwiki.apache.org/confluence/display/solr/ShawnHeisey
> >>
> >> Thanks,
> >> Shawn
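
For reference, here is a minimal sketch of how the second G1 flag set quoted above might be written in solr.in.sh, using the same GC_TUNE form as the Shenandoah example in the thread. It assumes the 30g heap is set through SOLR_HEAP instead of the InitialHeapSize/MaxHeapSize flags; that substitution is an assumption of this sketch, not something confirmed in the thread, and the values are illustrative only.

```shell
# Sketch for solr.in.sh (assumption: heap is set via SOLR_HEAP rather than
# the -XX:InitialHeapSize/-XX:MaxHeapSize flags used in the thread).
SOLR_HEAP="30g"

# Second G1 flag set from the thread (the variant without string deduplication).
GC_TUNE=" \
  -XX:+UseG1GC \
  -XX:MaxTenuringThreshold=8 \
  -XX:+PerfDisableSharedMem \
  -XX:+ParallelRefProcEnabled \
  -XX:MaxGCPauseMillis=100 \
  -XX:InitiatingHeapOccupancyPercent=55"
```

Solr needs a restart for changes in solr.in.sh to take effect, and it is worth trying any such change on a test system first, as recommended earlier in the thread.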