[ https://issues.apache.org/jira/browse/LUCENE-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-5554:
--------------------------------
Attachment: LUCENE-5554.patch
I debugged this. Look at the old Scorer.java (say, from 4.7): it has two
methods:
{code}
/* scores all docs (whole segment) until exhausted. */
score(Collector collector)
/* scores range (typically small range of just 2K or whatever BS1 does). More
complicated logic than the first one (range checks, must return 'more', etc). */
score(Collector collector, int max, int firstDocID)
{code}
In LUCENE-5487 these got combined into a single, more complicated method.
The API itself is fine: we just have to split these two very different
cases apart in the default implementation, and it's fixed for all scorers.
Here's a patch that does that and restores the performance. Mike's patch only
dodges this for his benchmark, but the issue here is more general. However,
before committing we should probably see if we can clean up the patch a bit.
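
To illustrate the idea only (this is not the attached patch), here is a minimal
sketch of a combined scoring method that dispatches once and then runs one of
two simple loops. Everything in it is an assumption made for illustration:
SplitScoringSketch, DocIterator, Collector, scoreAll and scoreRange are
hypothetical stand-ins, not Lucene's real classes or signatures.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, self-contained sketch; none of these types are Lucene's real classes. */
class SplitScoringSketch {
    /** Sentinel meaning "iterator exhausted" and, as max, "score the whole segment". */
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Stand-in for a collector: receives each matching doc ID. */
    interface Collector {
        void collect(int doc);
    }

    /** Stand-in for a doc ID iterator over a sorted array of doc IDs. */
    static final class DocIterator {
        private final int[] docs;
        private int pos = -1;

        DocIterator(int... docs) {
            this.docs = docs;
        }

        /** Returns -1 before the first nextDoc(), NO_MORE_DOCS once exhausted. */
        int docID() {
            if (pos < 0) return -1;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        int nextDoc() {
            pos++;
            return docID();
        }
    }

    /** Combined entry point: decide once which case applies, then run a simple loop. */
    static boolean score(DocIterator it, Collector c, int max) {
        if (it.docID() == -1) {
            it.nextDoc(); // position on the first doc if scoring has not started
        }
        if (max == NO_MORE_DOCS) {
            scoreAll(it, c); // whole segment: no range check, nothing to report back
            return false;
        }
        return scoreRange(it, c, max);
    }

    /** Case 1: score every remaining doc until the iterator is exhausted. */
    static void scoreAll(DocIterator it, Collector c) {
        for (int doc = it.docID(); doc != NO_MORE_DOCS; doc = it.nextDoc()) {
            c.collect(doc);
        }
    }

    /** Case 2: score docs below max and report whether more docs remain. */
    static boolean scoreRange(DocIterator it, Collector c, int max) {
        int doc = it.docID();
        while (doc < max) {
            c.collect(doc);
            doc = it.nextDoc();
        }
        return doc != NO_MORE_DOCS;
    }

    public static void main(String[] args) {
        List<Integer> seen = new ArrayList<>();
        DocIterator it = new DocIterator(1, 5, 9, 2000);
        boolean more = score(it, seen::add, 6);   // range case
        System.out.println(seen + " more=" + more);
        more = score(it, seen::add, NO_MORE_DOCS); // whole-segment case
        System.out.println(seen + " more=" + more);
    }
}
```

The point of the split is that each loop stays small and has a single exit
condition, which is presumably easier for Hotspot to compile well than one loop
that handles both cases.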
> Add TermBulkScorer
> ------------------
>
> Key: LUCENE-5554
> URL: https://issues.apache.org/jira/browse/LUCENE-5554
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5554.patch, LUCENE-5554.patch
>
>
> Hotspot was unhappy with the changes in LUCENE-5487, e.g.:
> http://people.apache.org/~mikemccand/lucenebench/OrHighHigh.html
> But it looks like we can get the performance back by making a dedicated
> BulkScorer for TermQuery.