Re: MaxScoreBulkScorer increased latency for an extreme test case (many SHOULD and each SHOULD clause matches all docs)

2024-09-17 Thread Rui Wu
One more data point: in Lucene 9.7, this query (12 SHOULD clauses) collected 1001 results, while in Lucene 9.11 the same query collected all docs (3.6M collect count). In Lucene 9.11, if the query has only one SHOULD clause, it collects 1001 results. If the query has multiple cl

2024-09-17 Thread Rui Wu
This query's latency increased from 14.65 ms to 20.90 ms. We use `TopScoreDocCollector.createSharedManager(/*batchSize*/ 101, /*searchAfterFieldDoc*/ null, /*hitsThreshold*/ 1000);`. Thanks a lot! On Tue, Sep 17, 2024 at 6:45 AM Adrien Grand wrote: > Can you tell us how long this query used to t

2024-09-17 Thread Adrien Grand
Can you tell us how long this query used to take, and how long it takes now? Also, are you using IndexSearcher's default total hit count threshold of 1,000, or are you passing a custom value to TopScoreDocCollectorManager? On Tue, Sep 17, 2024 at 10:14 AM Rui Wu wrote: > Hi Adrien, > > Thanks for

2024-09-17 Thread Rui Wu
Hi Adrien, Thanks for looking into this! Here are more screenshots of the flamegraph. The original flamegraph HTML files contain stack traces from our app, so I am not sharing them here. [image: Screenshot 2024-09-17 at 1.13.07 AM.png] [image: Screenshot 2024-09-17 at 1.12.01 AM.png] On Tue, Sep 17, 2024 at 1:0

2024-09-17 Thread Adrien Grand
Hello Rui, We actually released a change that should make MaxScoreBulkScorer faster on dense disjunctions in 9.8: https://github.com/apache/lucene/pull/12444. Your benchmark case is quite specific though as all clauses match all docs and produce constant scores, so I would expect the scorer to qui