I tested the workaround and saw no difference with or without it. The new cloud is set up with a 1200 MB heap for each instance and 32 GB of system RAM on each server. I'm seeing just over 20 GB in the system cache. Anti-virus exclusions have been applied and the system doesn't appear to be swapping unnecessarily.
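As a rough sanity check on those numbers, here is a minimal sketch of the per-server memory budget. It assumes the new cloud also runs 4 Solr nodes per server (as the older design did) and ignores JVM off-heap overhead and other processes; the function name is made up for illustration.

```python
def page_cache_headroom_mb(system_ram_mb, heap_mb_per_node, nodes_per_server):
    """RAM left for the OS page cache once every Solr heap is accounted for.

    Simplification: ignores JVM off-heap memory, the OS itself and any
    other processes running on the server.
    """
    return system_ram_mb - heap_mb_per_node * nodes_per_server


# ASSUMPTION: 4 nodes per server on the new cloud, mirroring the old layout.
new_cloud_mb = page_cache_headroom_mb(32 * 1024, 1200, 4)
print(f"Headroom for page cache: {new_cloud_mb} MB")
```

On these assumptions the heaps take 4800 MB in total, so an observed system cache of just over 20 GB fits within the remaining headroom.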
One thing I have noticed is that elapsed query time is often better when running without replicas. I still see a slow start-up on random access when no activity has taken place for 2 minutes or more.

The existing cloud was designed some time ago by design architects and has been running perfectly fine under Solr 5.4.1. It is in use 24/7, so I can't test what it is like when idle. I can only assume that, at the time, the 4 nodes per server were meant to leverage the 4 CPUs allocated to each machine. That setup has 16 GB of system RAM per server, so half of what the new cloud has.

Our data contains confidential information, so we host in our own datacentres and therefore cannot make use of datadoghq.

-----Original Message-----
From: Jan Høydahl <jan....@cominvent.com>
Sent: 02 December 2022 18:47
To: users@solr.apache.org
Subject: Re: Solr 9.1 performance

WARNING: This email originated from outside of NHS Wales. Do not open links or attachments unless you know the content is safe.

What I'm saying is that 9.1 includes a workaround for the cache issues, see
https://github.com/apache/solr/blob/releases/solr/9.1.0/solr/bin/solr#L2246-L2250

You may want to try to disable this workaround to see if it helps with the performance of your system. Alternatively, try JDK 11, which does not trigger the workaround. But it is just a shot in the dark; your issues may stem from something else, and we'd need many more details on your setup, config, physical RAM, heap etc.

I would like to question the decision to run 4 Solr nodes on the same server. Have you tried instead to run one Solr process per server, keeping 12 shards and 2 replicas? If you enable the affinity placement plugin and tag each node with data-center id and hostname, then Solr will place the shards/replicas evenly across all 6 servers.

Finally, add some observability to your cluster to learn what is actually going on. You can e.g. use Datadog <https://docs.datadoghq.com/integrations/solr/?tab=host> or another cloud provider to quickly get started.
It will help you discover what is happening in your cluster.

PS: Have you disabled all antivirus software? Made sure your heap size is as low as possible? Verified that your system is not swapping?

Jan

> On 2 Dec 2022, at 17:25, Joe Jones (DHCW - Software Development) <joe.jo...@wales.nhs.uk.INVALID> wrote:
>
> No, out of the box 9.1 doesn't include the patch. Tried adding it in and no difference.
>
> I've done some testing running the queries with "distrib=false" and can see that the query itself runs fine; it's just the call to the instance and the response that is slow.
>
> Something to do with Jetty?
>
> -----Original Message-----
> From: Jan Høydahl <jan....@cominvent.com>
> Sent: 02 December 2022 10:14
> To: users@solr.apache.org
> Subject: Re: Solr 9.1 performance
>
> Could it be related to https://solr.apache.org/news.html#java-17-bug-affecting-solr ? Doubt it as you don't use much caching, but hotspot optimization of caches is disabled by default in 9.1. You could try to edit the bin/solr script to disable the patch and see if anything is faster - risking a segfault crash instead :)
>
> Jan
>
>> On 2 Dec 2022, at 10:11, Joe Jones (DHCW - Software Development) <joe.jo...@wales.nhs.uk.INVALID> wrote:
>
> We welcome receiving correspondence in Welsh. We will reply to such correspondence in Welsh and this will not lead to a delay.
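The comparison discussed in this thread (query time reported by Solr versus the elapsed time seen by the caller, with "distrib=false", after a quiet period) could be scripted roughly as below. This is only a sketch: the base URL, the collection name and all function names are placeholders invented for illustration, and it relies on the `QTime` value that Solr returns in `responseHeader` of a JSON response.

```python
import json
import time
import urllib.parse
import urllib.request


def build_select_url(base_url, collection, query, distrib=False):
    """Build a /select URL; distrib=false keeps the query on the node hit."""
    params = urllib.parse.urlencode({
        "q": query,
        "rows": 0,
        "wt": "json",
        "distrib": "true" if distrib else "false",
    })
    return f"{base_url.rstrip('/')}/{collection}/select?{params}"


def overhead_ms(elapsed_ms, qtime_ms):
    """Time spent outside Solr's own query execution (network, Jetty, etc.)."""
    return max(0.0, elapsed_ms - qtime_ms)


def timed_query(url, timeout=30):
    """Return (wall-clock ms seen by the client, QTime ms reported by Solr)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = json.load(resp)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, body["responseHeader"]["QTime"]


def probe_after_idle(url, idle_seconds=120, rounds=3):
    """Wait idle_seconds, then time `rounds` back-to-back queries."""
    time.sleep(idle_seconds)
    results = []
    for _ in range(rounds):
        elapsed_ms, qtime_ms = timed_query(url)
        results.append((elapsed_ms, qtime_ms, overhead_ms(elapsed_ms, qtime_ms)))
    return results
```

For example, `probe_after_idle(build_select_url("http://localhost:8983/solr", "mycollection", "*:*"))` (host and collection being hypothetical) returns one tuple per request. If the first tuple shows a large overhead while `QTime` stays small, the delay sits outside query execution, which would be consistent with the "distrib=false" observation above.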