Thank you all for the suggestions.

I will try to profile and find the bottleneck.

I am getting the following exception, which I believe may be caused by the
multi-term expansion of the wildcard query. Please correct me if I am
wrong.
*The request took too long to iterate over doc values.*

WordDelimiterGraphFilter is not used; WordDelimiterFilter is used instead,
with the configuration below. As WordDelimiterFilter is deprecated, we
will remove it in the next step.
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="1" catenateWords="1" catenateNumbers="1"
catenateAll="1" splitOnCaseChange="1" preserveOriginal="1"/>

The wildcard queries are executed against the text data and, yes, there are
a huge number of possible expansions of each wildcard query.
All 12 shards are on a single machine with 521 GB of memory, and each shard
is started with SOLR_JAVA_MEM="-Xmx30g". So the 521 GB of memory is shared
by all 12 shards.
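
As a rough check of the memory picture above, the arithmetic can be sketched
as follows. This is only an estimate under stated assumptions: the 4 TB total
index size is taken from the quoted message below, and the calculation
ignores JVM non-heap memory and the OS itself, so the real amount left for
disk caching would be smaller than shown.

```python
# Rough memory budget for the single host described above.
TOTAL_RAM_GB = 521         # as stated in this message
SHARDS = 12
HEAP_PER_SHARD_GB = 30     # from SOLR_JAVA_MEM="-Xmx30g"
INDEX_SIZE_GB = 4 * 1024   # assumption: the 4 TB total from the quoted message


def page_cache_budget(total_ram_gb, shards, heap_per_shard_gb):
    """Return (total heap, RAM left over for the OS page cache), in GB.

    This is an upper bound on the page cache: JVM non-heap memory and
    other processes are not subtracted.
    """
    total_heap_gb = shards * heap_per_shard_gb
    return total_heap_gb, total_ram_gb - total_heap_gb


total_heap_gb, cache_gb = page_cache_budget(TOTAL_RAM_GB, SHARDS, HEAP_PER_SHARD_GB)
cache_fraction = cache_gb / INDEX_SIZE_GB

print(f"Total heap across shards: {total_heap_gb} GB")
print(f"Left for OS page cache (upper bound): {cache_gb} GB")
print(f"Share of index that can be cached: {cache_fraction:.1%}")
```

Under these assumptions, the heaps take 360 GB, leaving at most about 161 GB
of page cache for roughly 4 TB of index data.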

The cache configuration is as follows.
<filterCache class="solr.FastLRUCache" size="128" initialSize="128"
autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="128" initialSize="128"
autowarmCount="0"/>
<documentCache class="solr.LRUCache" size="128" initialSize="128"
autowarmCount="0"/>

Thanks,
Modassar

On Sat, Mar 26, 2022 at 9:43 PM Shawn Heisey <apa...@elyograg.org> wrote:

> On 3/26/2022 6:24 AM, Mike Drob wrote:
> > Can you provide more details on what the CPU time is spent on? Maybe look
> > at some JFR profiles or collect several jstacks to see where the
> > bottlenecks are.
> >
> > On Sat, Mar 26, 2022 at 3:49 AM Modassar Ather <modather1...@gmail.com>
> > wrote:
> >
> >> We are trying to migrate to Solr-8.11.0 from Solr-6.5.1. Following are the
> >> details of Solr installation.
> The output from "vmstat 5 -w" run for several minutes while the problem
> is happening will also help pinpoint bottlenecks.   This is best run in
> a terminal that is very wide -- say 132 columns or more.  I'm not saying
> Mike is wrong, just giving you another data point you can look at and
> share.
>
> Wildcard queries tend to be VERY inefficient unless they are executed on
> fields with very low cardinality.  I bet you're running them on the
> highest cardinality fields you have, fields which probably have millions
> or billions of unique tokens.
>
> What I suspect here is that you are running on the very edge of
> "insufficient system memory for disk caching" ... to the point where
> 6.5.1 was just barely able to handle things well, but changes since then
> have shifted things a little bit and now you're over the line into
> performance problems.
>
> Is a full copy of all 12 shards (totaling 4TB) resident on each machine,
> with 512GB memory?  If so, you probably need more memory installed in
> each server.
>
> https://cwiki.apache.org/confluence/display/solr/SolrPerformanceProblems
>
> (I wrote that wiki page, so if there are errors they are mine)
>
> Thanks,
> Shawn
>
>
