Thanks everyone for the tips!
@Mikhail
I tried your suggestion and it seems to work. We no longer run into the long
cold-start query time for that particular facet query. I have a
follow-up question: will adding three or four such facet queries have
performance implications for the searcher? I checked the memory usage and
there was no spike from this particular autowarm query. Are
docValues generally memory-hungry? I would expect them to have a
smaller memory footprint than the segments themselves.
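
For reference, a newSearcher warming listener of this kind goes in the
<query> section of solrconfig.xml and looks roughly like the sketch below.
The field name my_facet_field and the match-all query are placeholders,
not our actual config:

```xml
<!-- Sketch only: runs the listed queries against each new searcher
     before it starts serving requests. -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <!-- placeholder query; rows=0 because only the facet work matters -->
      <str name="q">*:*</str>
      <str name="rows">0</str>
      <str name="facet">true</str>
      <!-- placeholder field name; one facet.field entry per field to warm -->
      <str name="facet.field">my_facet_field</str>
    </lst>
  </arr>
</listener>
```

Each additional facet query would be another <lst> entry in the same
<arr name="queries"> block, so the warming cost should grow with the number
of entries.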

Thanks
Arun



On Wed, Feb 14, 2024 at 12:52 PM Arun Sudhir <arunsud...@gmail.com> wrote:

> @Rahul Surendran <rahul.surend...@tcs.com> facet indeed is where the time
> is spent:
>
> "process": {
>           "time": 1560,
>           "query": {
>             "time": 10
>           },
>           "facet": {
>             "time": 1545
>           },
>
> On Tue, Feb 13, 2024, 6:03 AM Rahul Goswami <rahul196...@gmail.com> wrote:
>
>> Can you pass debug=true with your query to find out which phase (query or
>> faceting) takes more time? This is to eliminate chasing the wrong symptom
>> to optimize for.
>>
>> -Rahul
>>
>> On Tue, Feb 13, 2024 at 3:48 AM Mikhail Khludnev <m...@apache.org> wrote:
>>
>> > Hello, Arun.
>> > Why don't you warm a new searcher with a query listener?
>> >
>> > On Tue, Feb 13, 2024 at 3:18 AM Arun Sudhir <arunsud...@gmail.com> wrote:
>> >
>> > > Hello,
>> > > We use solr for our search needs and we have documents indexed on a core
>> > > in multiple machines. Over time, the index on some machines has grown
>> > > from 30 GB to 60 GB now to a giant 133 GB. While others are still
>> > > hovering around 80GB, and some others are still under 30GB. We manually
>> > > control which documents go into which machine and do not use SolrCloud.
>> > >
>> > > We have a field in our index which is a docValue. What we have noticed
>> > > is that facet queries on this field take around 10 seconds for almost
>> > > the first call every minute or so on the huge server machines which have
>> > > ~130 GB index size. We commit every minute on our servers as well. We
>> > > have ensured that the machines do not starve on RAM and for the ones
>> > > which have 130 GB of index, we have 256 GB of RAM. So the segments are
>> > > all in memory all the time.
>> > >
>> > > Still, we see every call made after a minute or so takes 10 seconds on
>> > > the big shards with index size close to 130 GB, 6 seconds on the shards
>> > > that are 80GB, and less than 4 seconds on the normal shards whose size
>> > > is less than 30 GB.
>> > >
>> > > How can we optimize and get rid of this latency? We have tried using
>> > > DocValuesFormat=Direct, increasing the number of facet.threads,
>> > > increasing the heap size etc. Is there anything else we can do to get
>> > > the performance of facet queries on the large shards to under 2 seconds?
>> > >
>> > >
>> > > Thanks
>> > > Arun
>> > >
>> >
>> >
>> > --
>> > Sincerely yours
>> > Mikhail Khludnev
>> >
>>
>

-- 
arunsud...@google.com
