Hi,

I tried separate volumes on each instance and searches are still slow. Requesting more rows in the search query increases the search time several-fold. Both QTime and elapsed time increase for the GET_TOP_GROUPS request. The group field definition follows.
<fieldType name="string" class="solr.StrField" sortMissingLast="true" stored="false" omitNorms="true"/>

I also tried GC tuning with some ExperimentalVMOptions. Search speed improved slightly with Provisioned IOPS SSD (io2) at 10000 IOPS, but it still does not match the speed of Solr-6.5.1 on EFS. I have tried many options without reaching the desired performance, and I have validated the configuration and schema. I am not sure what is causing the slowness or whether I am missing some configuration. Kindly advise.

Thanks,
Modassar

On Fri, Apr 8, 2022 at 11:04 AM Modassar Ather <modather1...@gmail.com> wrote:

> Thanks Walter for your reply. Yes it is the same disk shared on all
> instances.
>
> Thanks,
> Modassar
>
> On Fri, Apr 8, 2022 at 10:54 AM Walter Underwood <wun...@wunderwood.org> wrote:
>
>> Are you sharing the same disk volume on all instances? I would expect
>> that to be slow and cause index corruption. Each instance should have its
>> own disk volumes. I’m looking at this part of your config.
>>
>> Storage : Multi-attach EBS Volume. Provisioned IOPS SSD (io1) with 3000
>> IOPS.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
>>
>> > On Apr 7, 2022, at 10:07 PM, Modassar Ather <modather1...@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > I tried a few different settings of GC and observed the following. The best
>> > result I got with the following environment and GC settings but still it is
>> > comparatively slower than the previous Solr-6.5.1 setup.
>> >
>> > Total index : 4+ TB
>> > Servers : 3 instances of x2gd.4xlarge systems each having 16 CPUs and 256
>> > GB RAM.
>> > Storage : Multi-attach EBS Volume. Provisioned IOPS SSD (io1) with 3000
>> > IOPS.
>> >
>> > GC settings
>> > 1.
>> > "-XX:+UseG1GC -XX:InitialHeapSize=30g -XX:MaxHeapSize=30g
>> > -XX:+UseStringDeduplication -XX:MaxTenuringThreshold=8
>> > -XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled
>> > -XX:MaxGCPauseMillis=100 -XX:InitiatingHeapOccupancyPercent=55"
>> >
>> > 2. "-XX:+UseG1GC -XX:InitialHeapSize=30g -XX:MaxHeapSize=30g
>> > -XX:MaxTenuringThreshold=8 -XX:+PerfDisableSharedMem
>> > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=100
>> > -XX:InitiatingHeapOccupancyPercent=55"
>> >
>> > The second GC setting is better for our environment but even with these
>> > settings we are facing issues of slowness even for simple term queries.
>> > The memory is not highly utilised and CPU% is also not very high. The
>> > slowness increases if we try to fetch more rows per request as well.
>> >
>> > Please provide your inputs.
>> >
>> > Thanks,
>> > Modassar
>> >
>> > On Mon, Mar 28, 2022 at 2:51 AM Shawn Heisey <apa...@elyograg.org> wrote:
>> >
>> >> On 3/27/2022 1:20 PM, Modassar Ather wrote:
>> >>> Just to add one point, even the queries without the wildcards e.g. a
>> >>> boolean query or a query with 10000 ids ORed has also become slow and it is
>> >>> also taking more CPU and finally ending up taking more time.
>> >>> I understand this is due to many GC pauses so if we fine tune the GC
>> >>> settings the CPU utilisation should go down.
>> >>
>> >> The GC settings that Solr 8.x comes with out of the box are already very
>> >> good. Disclaimer: They are very similar to settings that I came up with
>> >> after some intense testing work.
>> >>
>> >> If you are running a recent release of Java 11 or OpenJDK 11 and have a
>> >> test environment, you could try Shenandoah. My testing shows that this
>> >> collector makes a significant difference in GC pause activity, but that
>> >> throughput takes a definite hit.
>> >> If your indexing speed is sufficiently
>> >> fast, you could give these settings a try in your solr.in.sh file:
>> >>
>> >> GC_TUNE=" \
>> >> -XX:+UseShenandoahGC \
>> >> -XX:+AlwaysPreTouch \
>> >> -XX:+PerfDisableSharedMem \
>> >> -XX:+ParallelRefProcEnabled \
>> >> -XX:+UseStringDeduplication \
>> >> -XX:ParallelGCThreads=2 \
>> >> -XX:+UseNUMA"
>> >>
>> >> Note that UseNUMA will only make a difference if your server has more
>> >> than one NUMA node. But it will not harm anything if the server does
>> >> not have it.
>> >>
>> >> I mention indexing speed and throughput because indexing speed is where
>> >> I noticed a decrease with Shenandoah. Fully reindexing my dovecot
>> >> install (about 150K messages) takes about 8 minutes with G1 and about 9
>> >> minutes with Shenandoah. GC analysis revealed a larger number of
>> >> significantly smaller GC pauses, with the total pause time a little
>> >> lower. But my co-conspirator on Shenandoah testing (with a much larger
>> >> index than mine) said that their re-indexing process failed to complete
>> >> with Shenandoah, so if you have a test system you can try it on, I would
>> >> recommend doing that before deploying to production.
>> >>
>> >> My Shenandoah testing was done with OpenJDK 11.0.3, and that server now
>> >> has 11.0.14, which in theory should be a lot more stable.
>> >>
>> >> I also came up with some good CMS settings. But I think the CMS
>> >> collector has been deprecated, though I do not know what version of Java
>> >> might ultimately remove it.
>> >>
>> >> https://cwiki.apache.org/confluence/display/solr/ShawnHeisey
>> >>
>> >> Thanks,
>> >> Shawn
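
For anyone reproducing this, GC setting 2 quoted in this thread could be written in solr.in.sh roughly as follows. This is only a sketch: it assumes the 30g heap is set through SOLR_HEAP instead of the -XX:InitialHeapSize / -XX:MaxHeapSize flags, which is a layout choice not stated anywhere in the thread, and it should be tried on a test system first.

```shell
# Sketch only: GC setting 2 from this thread expressed as solr.in.sh variables.
# Assumption: the heap is configured via SOLR_HEAP rather than the
# -XX:InitialHeapSize / -XX:MaxHeapSize flags used in the original setting.
SOLR_HEAP="30g"

# G1 flags copied from setting 2; the trailing backslashes join the lines
# into a single space-separated string.
GC_TUNE=" \
-XX:+UseG1GC \
-XX:MaxTenuringThreshold=8 \
-XX:+PerfDisableSharedMem \
-XX:+ParallelRefProcEnabled \
-XX:MaxGCPauseMillis=100 \
-XX:InitiatingHeapOccupancyPercent=55"
```

Setting 1 differs only by the additional -XX:+UseStringDeduplication flag.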