Hi Richard,
when you mention "These tickets in particular sparked our interest, so we
spun up a parallel cluster with -Dsolr.http1=true, and there was no
difference in performance", do you mean that you still see the performance
degradation?

I will probably state the obvious, but an issue like this normally requires
a detailed, deep investigation.
I suspect that without putting our hands on your
cluster/config/architecture it is going to be difficult to give meaningful
suggestions.

Especially with no reference to how you are currently using Solr,
e.g. where do you see the degradation:
- indexing? Indexing how? Indexing what? What is the extent of the degradation?
- searching? What kind of queries? Faceting? Reranking? ...

That would definitely help but I suspect it's not going to be an easy one.

Cheers

--------------------------
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr PMC Member*

e-mail: a.benede...@sease.io


*Sease* - Information Retrieval Applied
Consulting | Training | Open Source

Website: Sease.io <http://sease.io/>
LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter
<https://twitter.com/seaseltd> | Youtube
<https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github
<https://github.com/seaseltd>


On Fri, 2 Dec 2022 at 13:15, Richard Goodman <richa...@brandwatch.com>
wrote:

> Hi Charlie,
>
> Gah, thanks for informing me of that; the images are here:
> <https://imgur.com/a/yEmBGuv>
>
> Cheers,
>
>
> On Tue, 29 Nov 2022 at 13:23, Charlie Hull <
> ch...@opensourceconnections.com>
> wrote:
>
> > Hey Richard,
> >
> > Attachments are stripped by this list so you might want to upload them
> > somewhere and link to them.
> >
> > Cheers
> >
> > Charlie
> >
> > On 25/11/2022 17:33, Richard Goodman wrote:
> > > Hi there,
> > >
> > > We have a cluster spread over 72 instances on k8s hosting around 12.5
> > > billion documents (made up of 30 collections, each collection having 12
> > > shards). We were originally using 7.7.2 and its performance was good
> > > enough for our business needs. We then recently upgraded our cluster to
> > > v8.11.2 and have noticed a drop in performance. I appreciate that there
> > > have been a lot of changes from 7.7.2 to 8.11.2, but I have been
> > > collecting metrics, and although the configuration (instance type,
> > > resource allocation, start-up opts) is the same, we are completely at a
> > > loss as to why it's performing worse, and were wondering if anyone had
> > > any guidance?
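> > >
> > > *(For context, that works out to 360 shards in total, about five per
> > > instance before replication, i.e. roughly 35 million documents per
> > > shard, assuming an even spread.)*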
> > >
> > > I recently stumbled across these tickets:
> > >
> > >     - SOLR-15840 <https://issues.apache.org/jira/browse/SOLR-15840> -
> > >     Performance degradation with http2
> > >     - SOLR-16099 <https://issues.apache.org/jira/browse/SOLR-16099> -
> > >     HTTP Client threads can hang
> > >
> > > These tickets in particular sparked our interest, so we spun up a
> > > parallel cluster with -Dsolr.http1=true, and there was no difference in
> > > performance. We're testing a couple of other ideas, such as a different
> > > DirectoryFactory *(I saw a message from someone in the Solr Slack about
> > > there being an issue with the MMap directory and vm.max_map_count)* and
> > > some GC settings, but we are really open to any suggestions. If it
> > > would help with any performance-related topics, we're also happy to use
> > > this cluster to test patches at large scale *(more specifically for the
> > > two Solr tickets listed above)*.
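> > >
> > > For anyone who wants to try the same toggles, this is roughly what we
> > > set (solr.in.sh excerpt; the DirectoryFactory class and the map-count
> > > value are illustrative rather than recommendations, and the property
> > > swap assumes the stock solrconfig.xml ${solr.directoryFactory:...}
> > > default):
> > >
> > >     # solr.in.sh (excerpt) - force HTTP/1.1 for inter-node traffic
> > >     SOLR_OPTS="$SOLR_OPTS -Dsolr.http1=true"
> > >
> > >     # Swap the DirectoryFactory via the system property picked up by
> > >     # the stock solrconfig.xml default
> > >     SOLR_OPTS="$SOLR_OPTS -Dsolr.directoryFactory=solr.NIOFSDirectoryFactory"
> > >
> > >     # On the host, raise the mmap limit if staying on MMapDirectory
> > >     sysctl -w vm.max_map_count=262144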
> > >
> > > I thought it would be useful to share some metrics I collected while we
> > > had two clusters spun up, one on 7.7.2 and one on 8.11.2, where the
> > > 8.11.2 cluster was active and all traffic was shadow-loaded into the
> > > 7.7.2 cluster for comparison. It's important to note that both clusters
> > > had the same configuration; to name a few settings (see the solr.in.sh
> > > sketch after this list):
> > >
> > >     - G1GC garbage collector
> > >     - TLOG replication
> > >     - 27Gi memory per instance
> > >     - 16Gi assigned to -Xmx and -Xms
> > >     - 16 cores
> > >     - -XX:G1HeapRegionSize=4m
> > >     - -XX:G1ReservePercent=20
> > >     - -XX:InitiatingHeapOccupancyPercent=35
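> > >
> > > Concretely, the heap/GC part of our solr.in.sh looks roughly like this
> > > (SOLR_HEAP and GC_TUNE are the standard solr.in.sh hooks; sketch
> > > trimmed to the flags listed above):
> > >
> > >     # solr.in.sh (excerpt) - identical on both clusters
> > >     SOLR_HEAP="16g"    # sets both -Xms and -Xmx
> > >     GC_TUNE="-XX:+UseG1GC \
> > >       -XX:G1HeapRegionSize=4m \
> > >       -XX:G1ReservePercent=20 \
> > >       -XX:InitiatingHeapOccupancyPercent=35"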
> > >
> > > One metric that did stand out was that 8.11.2 was churning through *a
> > > lot* of eden space in the heap, which can be seen in the metric
> > > screenshots below:
> > >
> > > Total Memory Usage: 7.7.2 vs 8.11.2
> > > Total Used G1 Pools: 7.7.2 vs 8.11.2
> > > Overall thread pool: 7.7.2 vs 8.11.2
> > >
> > > *(The screenshots were attached here but stripped by the list; see the
> > > imgur link in my follow-up above.)*
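> > >
> > > For anyone who wants to check the eden churn on a live node without our
> > > dashboards, the stock JDK tool is enough (the PID is a placeholder;
> > > this samples every 1000 ms, and eden occupancy is the E column):
> > >
> > >     jstat -gcutil <solr-pid> 1000
> > >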
> > > Any guidance, or requests for performance-related things to test, would
> > > be appreciated.
> > >
> > > Thanks,
> > >
> > > Richard
> > >
> > --
> > Charlie Hull - Managing Consultant at OpenSource Connections Limited
> > Founding member of The Search Network <http://www.thesearchnetwork.com>
> > and co-author of Searching the Enterprise
> > <https://opensourceconnections.com/wp-content/uploads/2020/08/ES_book_final_journal_version.pdf>
> > tel/fax: +44 (0)8700 118334
> > mobile: +44 (0)7767 825828
> >
> > OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
> > Amtsgericht Charlottenburg | HRB 230712 B
> > Geschäftsführer: John M. Woodell | David E. Pugh
> > Finanzamt: Berlin Finanzamt für Körperschaften II
>
>
>
> --
>
> Richard Goodman (he/him)   |    Senior Data Infrastructure engineer
>
> richa...@brandwatch.com
>
>
> NEW YORK   |   BOSTON   |   CHICAGO   |   TORONTO   |   *BRIGHTON*   |
> LONDON   |   COPENHAGEN   |    BERLIN   |   STUTTGART   |   FRANKFURT   |
> PARIS  |   BUDAPEST   |   SOFIA  |   CHENNAI   |    SINGAPORE   |   SYDNEY
> |   MELBOURNE
>
