I see. This is a 50kB allocation per segment, which is fine under normal usage but becomes noticeable with percolator queries, which create a new MaxScoreBulkScorer for every document?
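
To put rough numbers on that, here is a back-of-the-envelope sketch. Only the 50kB figure comes from this thread; the class name, the 20-segment index and the one million percolated documents are made-up assumptions for illustration, and nothing here is a Lucene API.

```java
/** Back-of-the-envelope model only; none of these names are Lucene APIs. */
final class BulkScorerAllocationEstimate {
    // Figure quoted in this thread: roughly 50kB of buffers per bulk scorer.
    // Treating "50kB" as 50 * 1024 bytes is an assumption.
    static final long BYTES_PER_BULK_SCORER = 50L * 1024;

    /** Total bytes allocated eagerly when scorerCount bulk scorers are created. */
    static long bytesForScorers(long scorerCount) {
        return BYTES_PER_BULK_SCORER * scorerCount;
    }

    public static void main(String[] args) {
        // Normal search: one bulk scorer per segment (hypothetical 20 segments).
        long perSearch = bytesForScorers(20);
        // Percolator-style usage: one bulk scorer per document
        // (hypothetical one million documents).
        long perMillionDocs = bytesForScorers(1_000_000);

        System.out.println("per search:       " + perSearch + " bytes");
        System.out.println("per million docs: " + perMillionDocs + " bytes");
    }
}
```

Under these assumptions a regular search allocates about 1MB in total, while a million percolated documents churn through about 51GB of short-lived buffers, which is where the garbage-collection and zeroing cost would start to show.
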
In general, bulk scorers will want to allocate large arrays/bit sets to help with bulk processing of documents; some other bulk scorers do this as well: BatchScoreBulkScorer, BlockMaxConjunctionBulkScorer, DenseConjunctionBulkScorer, DisjunctionMaxBulkScorer. I wonder if a better fix would be to disable bulk scoring for percolator/monitor-style usage and force doc-at-a-time evaluation by using ScorerSupplier#get() (possibly wrapped in a DefaultBulkScorer if you'd like to consume hits via the BulkScorer API while still doing doc-at-a-time evaluation) instead of ScorerSupplier#bulkScorer().

On Wed, Jul 1, 2026 at 12:40 PM Alan Woodward <[email protected]> wrote:

> Hi all,
>
> We’ve found a regression in 10.5.0 due to eager allocation of large array
> buffers in MaxScoreBulkScorer - fix proposed here:
> https://github.com/apache/lucene/pull/16316
>
> This particularly hits boolean queries with an expensive two-phase
> subclause (in our case, some percolator queries got a lot slower). I think
> it probably warrants a 10.5.1 bugfix.
>
> - Alan
>

-- 
Adrien
