Solr 9.1 performance

2022-12-02 Thread Joe Jones (DHCW - Software Development)
Hello all,

We currently have a Solr cloud set up under version 5.4.1 running on Windows.

It contains 45 million records in a collection split across 12 shards with a 
replication factor of 2.  The 12 shards are hosted across 6 servers running 4 
nodes each.  3 servers are in one data center and 3 in another so that we can 
have high availability and site redundancy.

This works great, serves hundreds of queries per minute and handles a trickle 
feed of new documents (20K a day) and updates easily, but runs on old kit.

We've built a new cloud using Solr 9.1 (Eclipse Adoptium OpenJDK 17.0.4) with 
the same topology and settings on new servers, and I have found that the initial 
few queries on a node are really bad.  P95 times across the board are 6-8K ms.  
It's like the first couple of queries are spinning up a new indexer before 
settling into memory.  Once in memory (or spun up), queries return in <10ms, 
but leave it idle for 90 seconds and we get the slow query again.  I've noticed 
that on some occasions the admin UI panel can also be slow to query if left idle.  
Browser dev tools also show that, for the first query, the page time is greater 
than the already high query/elapsed time.

We do not autowarm caches due to the nature of the queries being used.  Garbage 
collection looks fine on a 1200 MB allocation.  GC runs are quick and infrequent, 
with usage usually reaching 70% of the 1200 MB allocation before being cleaned.

We see the same behaviour without any active indexing taking place so can rule 
that out.  The cloud was previously built on version 9 with the same issue.


The only thing that helps keep queries quick is if I spam requests at the cloud 
constantly.  I can't compare any idle periods on the 5.4 Solr cloud because 
there's always some activity.

Perhaps once this is used in production the activity will keep things alive, 
but is there something else I can look at to keep the cloud active at all times?

Thanks.

Rydym yn croesawu derbyn gohebiaeth yng Nghymraeg. Byddwn yn ateb y fath 
ohebiaeth yng Nghymraeg ac ni fydd hyn yn arwain at oedi.
We welcome receiving correspondence in Welsh. We will reply to such 
correspondence in Welsh and this will not lead to a delay.
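
One way to pin down the idle threshold described above is to time an identical query after progressively longer pauses. The following is only a rough sketch under assumptions: the node URL, the collection name `mycollection` and the gap values are placeholders, and `q=*:*` with `rows=0` is just a cheap probe query. `percentile` is a plain nearest-rank helper for reproducing P95-style figures from the collected timings.

```python
import math
import time
import urllib.parse
import urllib.request

# Hypothetical values: replace with a real node URL and collection name.
SOLR_NODE = "http://localhost:8983/solr"
COLLECTION = "mycollection"


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def timed_query(base_url=SOLR_NODE, collection=COLLECTION):
    """Run a cheap match-all query and return the wall-clock time in ms."""
    params = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    url = f"{base_url}/{collection}/select?{params}"
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=30) as resp:
        resp.read()
    return (time.perf_counter() - start) * 1000.0


def probe_idle_gaps(gaps_seconds, query=timed_query, sleep=time.sleep):
    """Warm up once, then time a single query after each idle gap."""
    query()  # warm-up call, result discarded
    results = {}
    for gap in gaps_seconds:
        sleep(gap)
        results[gap] = query()
    return results
```

Calling `probe_idle_gaps([5, 30, 60, 90, 120])` against a single node would show at which idle gap the latency jumps, which may help tell a timeout-driven effect (connections, OS page cache eviction, power management) apart from a Solr cache effect.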


Re: Solr 9.1 performance

2022-12-02 Thread Jan Høydahl
Could it be related to 
https://solr.apache.org/news.html#java-17-bug-affecting-solr ? I doubt it, as you 
don't use much caching, but HotSpot optimization of caches is disabled by 
default in 9.1. You could try editing the bin/solr script to disable the patch and 
see if anything is faster, at the risk of a segfault crash instead :)

Jan




Re: 8.11.2 Performance degradation

2022-12-02 Thread Richard Goodman
Hi Charlie,

Gah, thanks for informing me of that. A link to the images is here.


Cheers,


On Tue, 29 Nov 2022 at 13:23, Charlie Hull wrote:

> Hey Richard,
>
> Attachments are stripped by this list so you might want to upload them
> somewhere and link to them.
>
> Cheers
>
> Charlie
>
> On 25/11/2022 17:33, Richard Goodman wrote:
> > Hi there,
> >
> > We have a cluster spread over 72 instances on k8s hosting around 12.5
> > billion documents (made up of 30 collections, each collection having 12
> > shards). We were originally using 7.7.2 and performance was okay enough for
> > us for our business needs. We then recently upgraded our cluster to
> > v8.11.2, and have noticed a drop in performance. I appreciate that there
> > have been a lot of changes from 7.7.2 to 8.11.2, but I have been collecting
> > metrics, and although the configuration (instance type and resource
> > allocation, start up opts) are the same, we are completely at a loss as to
> > why it's performing worse, and was wondering if anyone had any guidance?
> >
> > I recently stumbled across the tickets;
> >
> > - SOLR-15840 - Performance degradation with http2
> > - SOLR-16099 - HTTP Client threads can hang
> >
> > In particular which sparked interest, and so we spun up a parallel cluster
> > with -Dsolr.http1=true, and there was no difference in performance. We're
> > testing a couple of other ideas, such as a different DirectoryFactory *(as I
> > saw a message from someone in the Solr Slack about there being an issue
> > with the MMap directory and vm.max_map_count)*, some GC settings, but are
> > really open to any suggestions. We're also happy if it'll help with any
> > performance related topics to use this cluster to test patches at a large
> > scale to see if it'll help with performance *(more specifically to the two
> > Solr tickets listed above)*.
> >
> > I thought it would be useful to show some metrics I collected where we had
> > 2 clusters spun up, 1 being 7.7.2 and 1 being 8.11.2 where the 8.11.2
> > cluster was the active, and all traffic was being shadow loaded into the
> > 7.7.2 cluster to compare against. It's important to note that both clusters
> > had the same configuration, here is a list to name a few:
> >
> > - G1GC garbage collector
> > - TLOG replication
> > - 27Gi Memory per instance
> > - 16Gi assigned to -Xmx and -Xms
> > - 16 cores
> > - -XX:G1HeapRegionSize=4m
> > - -XX:G1ReservePercent=20
> > - -XX:InitiatingHeapOccupancyPercent=35
> >
> > One metric that did stand out, was that 8.11.2 was churning through *a lot* of
> > eden space in the heap, which can be seen in some of the screenshots of
> > metrics below;
> >
> > [Screenshots not preserved in the archive: Total Memory Usage (7.7.2 vs
> > 8.11.2), Total Used G1 Pools (7.7.2 vs 8.11.2), and the overall thread
> > pool (7.7.2 vs 8.11.2).]
> >
> > Any guidance or requests to test for performance wise would be appreciated.
> >
> > Thanks,
> >
> > Richard
> >
> --
> Charlie Hull - Managing Consultant at OpenSource Connections Limited
> Founding member of The Search Network
> and co-author of Searching the Enterprise
> <https://opensourceconnections.com/wp-content/uploads/2020/08/ES_book_final_journal_version.pdf>
> tel/fax: +44 (0)8700 118334
> mobile: +44 (0)7767 825828
>
> OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
> Amtsgericht Charlottenburg | HRB 230712 B
> Geschäftsführer: John M. Woodell | David E. Pugh
> Finanzamt: Berlin Finanzamt für Körperschaften II



-- 

Richard Goodman (he/him)   |Senior Data Infrastructure engineer

richa...@brandwatch.com




Re: Wired behavior of maxClauseCount restriction since upgrading to solr 9.1

2022-12-02 Thread michael dürr
Hi Jan,

thanks! That helped :-)

On Fri, Dec 2, 2022 at 8:47 AM Jan Høydahl  wrote:

> A plain q=id:(a b c) is parsed into a boolean query with three SHOULD
> clauses, i.e. OR. Try to add &debugQuery=true to a request and see how it
> gets parsed. Then if the limit is 1024 you'll get errors above.
>
> Jan
>
> > On 2 Dec 2022 at 07:43, michael dürr wrote:
> >
> > Thanks to all of you for your advice on using the terms query! I wasn't
> > aware of this syntax until now.
> >
> > Anyways it would be good to know whether I hit a bug or not.
> > Are my example queries probably rewritten to something that has more
> > boolean clauses?
> > If so, why doesn't that apply to the query for the unique key field?
> >
> > Maybe someone can give some insights here?
> >
> > Thanks,
> > Michael
> >
> > On Thu, Dec 1, 2022 at 7:45 PM Kevin Risden  wrote:
> >
> >>
> >> https://solr.apache.org/guide/solr/latest/query-guide/other-parsers.html#terms-query-parser
> >>
> >> The "{!terms ...}" syntax is short for a query parser. It's a terms query
> >> parser and, as Jan said, it's way more efficient than boolean clauses for a
> >> list of terms.
> >> Kevin Risden
> >>
> >>
> >>
> >> On Thu, Dec 1, 2022 at 1:04 PM Thomas Heigl wrote:
> >>
> >>> Hi Jan,
> >>>
> >>> We ran into the same issue. Terms queries sound like the ideal solution for
> >>> our use case, but I couldn't find any documentation on the {!terms} syntax.
> >>> Is there anything in the official docs?
> >>>
> >>> Best,
> >>>
> >>> Thomas
> >>>
> >>> On Thu, Dec 1, 2022 at 2:09 PM Jan Høydahl wrote:
> >>>
> >>>> Have you tried using Terms Query? It is much more efficient than many
> >>>> boolean should clauses
> >>>>
> >>>> ?q={!terms f=id}1 2 3 4...1025
> >>>>
> >>>> Jan
> 
> > On 1 Dec 2022 at 13:27, michael dürr wrote:
> >
> > Hi,
> >
> > today we updated solr to version 9.1 (lucene version 9.3)
> > Since then we noticed plenty of TooManyNestedClauses in the logs. Our
> > setting for maxClauseCount is 1024
> > I played around a lot and could trace it down to this:
> >
> > * I built an index from scratch with two fields (id is unique key) and
> > luceneMatchVersion 9.3:
> >
> >  > multiValued="false" required="true"/>
> >  >>> stored="false"
> > multiValued="false" />
> >
> >  >>> sortMissingLast="true"
> > omitNorms="true" docValues="true" />
> > <fieldType name="p_long_dv" class="solr.LongPointField" docValues="true" omitNorms="true" />
> >
> > As expected this works (the dots (...) represent the complete set of numbers
> > up to 1024):
> >
> > curl -XGET http://localhost:8983/solr/myindex/select?q=+id:(1 2 3 ... 1024)
> >
> > And this fails:
> >
> > curl -XGET http://localhost:8983/solr/myindex/select?q=+id:(1 2 3 ... 1025)
> >
> > But when I use the other field (categoryId) this fails:
> >
> > curl -XGET http://localhost:8983/solr/myindex/select?q=+categoryId:(1 2 3 ... 1024)
> >
> > It works until 512 and starts failing from 513 clauses
> >
> > No difference when doing it like this:
> >
> > curl -XGET http://localhost:8983/solr/myindex/select?q=+(categoryId:1 categoryId:2 ... categoryId:1024)
> >
> > Am I misunderstanding the limit maxClauseCount?
> >
> > I'm pretty sure that we did not have any issues with this before.
> >
> > Thanks,
> > Michael
> 
> 
> >>>
> >>
>
>
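
For readers who want to try the terms query suggested in this thread, here is a minimal sketch of how a client might build the request. It is an illustration, not anyone's production code: the helper names are made up, and it joins the values with commas because the Solr Reference Guide documents a comma as the default separator for the terms parser (a `separator` local parameter can change that). Sending the parameters as a form-encoded POST body is an assumption made here to avoid very long GET URLs.

```python
import urllib.parse


def terms_query_params(field, values, rows=10):
    """Build request parameters for a {!terms} query over many values."""
    # The terms parser splits its input on commas by default.
    joined = ",".join(str(v) for v in values)
    return {"q": f"{{!terms f={field}}}{joined}", "rows": rows}


def encode_form_body(params):
    """URL-encode parameters for use as a POST body."""
    return urllib.parse.urlencode(params).encode("utf-8")
```

Because the whole list becomes a single query rather than one boolean clause per value, a request built this way for 1025 or more ids should not run into the maxClauseCount limit discussed above.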


RE: Very High CPU when indexing

2022-12-02 Thread Matias Laino
Hello Jan, thanks for your reply!

I'm not very experienced with cache settings in Solr; this is the first time 
I'm setting it up myself.

These are the settings I was able to find in our solrconfig.xml









In the meantime, I'll look into caching. Thanks again!

MATIAS LAINO | DIRECTOR OF PASSARE REMOTE DEVELOPMENT
matias.la...@passare.com | +54 11-6357-2143


-Original Message-
From: Jan Høydahl  
Sent: Thursday, December 1, 2022 10:11 PM
To: users@solr.apache.org
Subject: Re: Very High CPU when indexing

What are your cache settings? Are you using autoWarmCount or explicit cache 
warming? It could be a source of long commit times.

Jan

> On 1 Dec 2022 at 22:35, Matias Laino wrote:
> 
> 
> I've tried multiple different autoSoftCommit and autoCommit 
> configurations, and it always takes 2:30 - 3 minutes for the records to become 
> available in search. CPU has been fine since I upgraded, and memory 
> should be plenty unless I'm mistaken. I'm lost at this point.
> 
> Any help will be really appreciated
> 
> MATIAS LAINO | DIRECTOR OF PASSARE REMOTE DEVELOPMENT 
> matias.la...@passare.com | +54 11-6357-2143
> 
> 
> -Original Message-
> From: Matias Laino 
> Sent: Thursday, December 1, 2022 1:11 PM
> To: users@solr.apache.org
> Subject: RE: Very High CPU when indexing
> 
> Hi Shawn, thanks again for the reply.
> 
> I've tried increasing the memory to 32 GB of RAM with a 16 GB heap and 8 cores, 
> and even though I still see peaks of 300% CPU on the Solr process, it can 
> handle it (Solr doesn't go down).
> However, I've tried several different configurations for autoCommit and 
> autoSoftCommit, and results always take a few minutes to show up in search, 
> which is really unacceptable for us. I'm not sure how to proceed now.
> 
> I've looked at the cores and for example of the collection I'm testing 
> against right now, I see these values:
> 
> Core 1: 
> Num Docs:4806841
> Max Doc:4845793
> Heap Memory Usage:387392
> Core 2:
> Num Docs:4810159
> Max Doc:4849229
> Heap Memory Usage:450008
> 
> Other collections look fairly similar, except for this one:
> 
> Preview Core1:
> Num Docs:5774937
> Max Doc:5832482
> Heap Memory Usage:407424
> 
> Preview Core2:
> Num Docs:5774937
> Max Doc:5833942
> Heap Memory Usage:463632
> 
> Preview Core 3:
> Num Docs:5778245
> Max Doc:5790174
> Heap Memory Usage:480672
> 
> For some reason, the "Preview Collection" has 3 shards instead of 2 like it 
> had before... maybe that could be related? The collection overview says shards 
> 2 and replication factor 2.
> 
> As additional info, ZooKeeper is running on its own server and Solr is the 
> only thing running on that server, aside from some system processes.
> 
> Thanks again! 
> 
> MATIAS LAINO | DIRECTOR OF PASSARE REMOTE DEVELOPMENT 
> matias.la...@passare.com | +54 11-6357-2143
> 
> 
> -Original Message-
> From: Shawn Heisey 
> Sent: Thursday, December 1, 2022 1:07 AM
> To: users@solr.apache.org
> Subject: Re: Very High CPU when indexing
> 
> On 11/30/22 08:57, Matias Laino wrote:
>> Q: What is the total document count?
>> A: Based on the dashboard, it's Total #docs: 68.6mn each node (I'm 
>> replicating the same data on both)
> 
> Each core has a count.  And here you can see what I was talking about with 
> max doc compared to num docs.
> 
> https://www.dropbox.com/s/jdgddn4ve5mluhr/core_doc_counts.png?dl=0
> 
>> Q: but it would be great to have an on-disk size and document count 
>> (max docs, not num docs) for each collection
>> A: I'm not sure where to get that from metrics; based on the cloud dashboard 
>> it says the following by shard:
>> preview_s1r2:  1.9Gb
>> preview_s2r11:  1.9Gb
>> preview_s2r6:  1.9Gb
>> staging-d_s1r1:  1.8Gb
>> staging-d_s2r4:  1.8Gb
>> staging-a_s1r1:  1.7Gb
>> staging-a_s2r4:  1.7Gb
>> staging-c_s2r5:  1.6Gb
>> staging-c_s1r2:  1.6Gb
>> pre-prod_s1r1:  1.6Gb
>> pre-prod_s2r4:  1.6Gb
>> staging-b_s1r2:  1.5Gb
>> staging-b_s2r5:  1.5Gb
>> That is replicated on the other node.
> 
> So you've got 22GB of data, and assuming Solr is the only thing running on 
> the machine, only about 8GB of memory to cache it (total RAM of 16GB minus 
> 8GB for the Solr heap).  I would hope for at least 12GB of cache for that, 
> and more is always better. 8GB may not be enough.  If you have other software 
> running on the machine, it will be even less. Does ZK live on the same 
> instance?  If so, how much heap are you giving to that?
> 
> Performance of a system is often perfectly fine up until some threshold, and 
> once you throw just a little bit more data in the mix so it goes over that 
> threshold, performance drops drastically. That is how a small increase can 
> bring a system to its knees.
> 
> If you can upgrade the instance to one with more memory, that might also 
> help, but I do think that the biggest problem is the autoSoftCommit setting.  
> If you really can't make it at least two minutes, which is the value I would 
> use, then set it as high as you can.
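
The sizing arithmetic in Shawn's reply can be reproduced with a few lines. This is only a worked example using the figures quoted in this thread (the per-shard sizes from the cloud dashboard, 16GB of RAM, an 8GB heap, and the numDocs/maxDoc values for "Core 1"); the helper names are made up for illustration.

```python
# Shard sizes in GB, copied from the dashboard listing quoted above.
SHARD_SIZES_GB = {
    "preview_s1r2": 1.9, "preview_s2r11": 1.9, "preview_s2r6": 1.9,
    "staging-d_s1r1": 1.8, "staging-d_s2r4": 1.8,
    "staging-a_s1r1": 1.7, "staging-a_s2r4": 1.7,
    "staging-c_s2r5": 1.6, "staging-c_s1r2": 1.6,
    "pre-prod_s1r1": 1.6, "pre-prod_s2r4": 1.6,
    "staging-b_s1r2": 1.5, "staging-b_s2r5": 1.5,
}


def page_cache_budget(total_ram_gb, heap_gb, other_gb=0.0):
    """Memory left for the OS disk cache after the JVM heap and other software."""
    return total_ram_gb - heap_gb - other_gb


def deleted_fraction(num_docs, max_doc):
    """Share of maxDoc that is deleted documents not yet merged away."""
    return (max_doc - num_docs) / max_doc


index_gb = round(sum(SHARD_SIZES_GB.values()), 1)  # total on-disk index size
cache_gb = page_cache_budget(16, 8)                # RAM left for caching it
core1_deleted = deleted_fraction(4806841, 4845793)
```

The shard sizes add up to roughly 22GB against about 8GB of available disk cache, which is the gap the reply points at. The deleted-document share for "Core 1" comes out below 1%, so deleted documents do not look like a major factor for that core.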

RE: Solr 9.1 performance

2022-12-02 Thread Joe Jones (DHCW - Software Development)
No, out of the box 9.1 doesn't include the patch.  Tried adding it in and no 
difference.

I've done some testing running the queries with "distrib=false" and can see that 
the query itself runs fine; it's just the call to the instance and the response 
that are slow.

Something to do with Jetty?
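
To put numbers on that observation, one option is to compare the `QTime` that Solr reports in the response header with the wall-clock time seen by the client. The sketch below assumes a JSON response and a core URL supplied by the caller; the function names are hypothetical, and the split is only approximate because `QTime` does not cover writing the response or network transfer.

```python
import json
import time
import urllib.parse
import urllib.request


def split_latency(elapsed_ms, response_json):
    """Separate Solr-reported QTime from the rest of the round trip."""
    qtime = response_json["responseHeader"]["QTime"]
    return {"qtime_ms": qtime, "overhead_ms": elapsed_ms - qtime}


def measure(core_url, q="*:*"):
    """Query a single core with distrib=false and split the timing."""
    params = urllib.parse.urlencode(
        {"q": q, "rows": 0, "distrib": "false", "wt": "json"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(f"{core_url}/select?{params}", timeout=30) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return split_latency(elapsed_ms, body)
```

A large `overhead_ms` next to a small `qtime_ms` after an idle period would point away from query execution and towards connection handling, the servlet container, or the host itself.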



Re: 8.11.2 Performance degradation

2022-12-02 Thread Alessandro Benedetti
Hi Richard,
when you mention "In particular which sparked interest, and so we spun up a
parallel cluster
with -Dsolr.http1=true, and there was no difference in performance. ", do
you mean that you still see the degradation in performance right?

I will probably state the obvious, but normally you would require a detailed,
deep investigation to understand your issue.
I suspect that without putting our hands on your
cluster/config/architecture it is going to be difficult to give meaningful
suggestions.

Especially with no reference to what you are currently using in Solr,
e.g. do you see the degradation in:
- indexing? indexing how? indexing what? The extent of the degradation
- searching? what kind of queries? faceting? reranking?...

That would definitely help but I suspect it's not going to be an easy one.

Cheers

--
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr PMC Member*

e-mail: a.benede...@sease.io


*Sease* - Information Retrieval Applied
Consulting | Training | Open Source

Website: Sease.io 




Re: Solr 9.1 performance

2022-12-02 Thread Jan Høydahl
What I'm saying is that 9.1 includes a workaround for the cache issues, see 
https://github.com/apache/solr/blob/releases/solr/9.1.0/solr/bin/solr#L2246-L2250
You may want to try to disable this workaround to see if it helps with the 
performance of your system. Alternatively try with JDK11, which does not 
trigger the workaround.

But it is just a shot in the dark; your issues may stem from something else, and 
we'd need many more details on your setup, config, physical RAM, heap etc.

I would like to question the decision of running 4 solr nodes on the same 
server. Have you tried instead to run one solr process per server, keeping 12 
shards and 2 replicas?
If you enable affinity placement plugin and tag each node with data-center id 
and hostname, then solr will place the shards/replicas evenly across all 6 
servers.

Finally, add some observability to your cluster to learn what is actually going 
on. You can e.g. use Datadog or another cloud provider to quickly get started. 
It will help you discover what is happening in your cluster.

PS: Have you disabled all Antivirus software? Made sure your heap size is as 
low as possible? Verified that your system is not swapping?

Jan




Re: solr and dovecot: high load

2022-12-02 Thread Shawn Heisey

On 12/1/22 02:47, alessia.gagli...@qboxmail.it.INVALID wrote:
> We have another solr (that handles less users) that had a high I/O 
> activity, which was reduced after increasing the physical memory from 
> 24GB to 48GB. Could this be a scalable solution?


More memory not consumed by programs will often greatly increase general 
performance, and Solr in particular relies on the disk cache.  Add 
enough memory and there will be almost zero I/O for queries; most of the 
I/O will be for index updates.


https://cwiki.apache.org/confluence/display/solr/solrperformanceproblems

Thanks,
Shawn



Re: SOLR adding ,​ to strings erroneously

2022-12-02 Thread Shawn Heisey

On 12/1/22 05:41, Eric Pugh wrote:

> Shawn,
>
> Have we received a couple of mentions of this?  Or am I misremembering?  Do we 
> need to open a JIRA and change how logging.js works?


This is the first mention of this that I can recall seeing; apparently I 
missed Dmitri's issue.


I'm curious as to why those entities are displaying as text instead of 
being interpreted by the browser as a zero-width space.


Thanks,
Shawn