[
https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Da Huang updated LUCENE-4396:
-----------------------------
Attachment: LUCENE-4396.patch
This patch is based on git mirror commit
ce7d0578b30981d15687bf76aec595274efccbad
In this patch, I just compact the array as I go through the MUST_NOT docs.
{code}
Task                QPS baseline  StdDev   QPS my_version  StdDev   Pct diff
HighAndTonsLowNot   4.88          (3.5%)   2.44            (4.4%)   -49.9% ( -55% - -43%)
HighAndSomeLowNot   6.55          (6.1%)   3.60            (4.7%)   -45.1% ( -52% - -36%)
HighAndSomeLowOr    9.93          (12.9%)  5.49            (4.7%)   -44.7% ( -55% - -31%)
LowAndSomeLowNot    293.78        (2.3%)   216.29          (1.7%)   -26.4% ( -29% - -22%)
LowAndSomeLowOr     347.60        (1.8%)   266.94          (1.2%)   -23.2% ( -25% - -20%)
HighAndTonsLowOr    5.59          (5.7%)   4.34            (4.4%)   -22.4% ( -30% - -13%)
PKLookup            97.38         (2.1%)   95.54           (2.9%)   -1.9% ( -6% - 3%)
HighAndSomeHighNot  1.88          (2.2%)   1.89            (6.6%)   0.7% ( -7% - 9%)
LowAndSomeHighOr    41.32         (2.9%)   41.92           (2.8%)   1.5% ( -4% - 7%)
LowAndSomeHighNot   54.74         (2.4%)   56.73           (2.7%)   3.7% ( -1% - 8%)
HighAndSomeHighOr   2.26          (2.7%)   2.56            (6.8%)   13.3% ( 3% - 23%)
LowAndTonsLowNot    17.18         (1.2%)   22.14           (2.4%)   28.9% ( 24% - 32%)
LowAndTonsHighOr    1.21          (2.7%)   1.57            (4.4%)   29.8% ( 22% - 37%)
LowAndTonsLowOr     17.22         (1.3%)   22.53           (2.4%)   30.9% ( 26% - 35%)
HighAndTonsHighOr   0.07          (1.2%)   0.16            (13.1%)  141.0% ( 125% - 157%)
LowAndTonsHighNot   2.02          (2.4%)   9.70            (9.7%)   380.6% ( 360% - 402%)
HighAndTonsHighNot  0.09          (1.2%)   0.50            (23.1%)  475.7% ( 446% - 505%)
{code}
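For readers who have not opened the patch, the compaction idea mentioned above can be sketched roughly as follows. This is only an illustration, not the code in LUCENE-4396.patch: the class name CompactSketch, the method compact, and the use of plain sorted int arrays for the candidate and MUST_NOT doc IDs are all assumptions made for the example.

```java
import java.util.Arrays;

public class CompactSketch {
    /**
     * Hypothetical helper: removes every doc ID in candidates[0..size) that
     * also appears in excluded (the MUST_NOT docs). Both arrays must be
     * sorted ascending. Survivors are shifted to the front of candidates
     * in place, and the number of survivors is returned.
     */
    static int compact(int[] candidates, int size, int[] excluded) {
        int write = 0; // next slot for a surviving doc ID
        int ex = 0;    // cursor into the MUST_NOT docs
        for (int read = 0; read < size; read++) {
            int doc = candidates[read];
            // Move the MUST_NOT cursor forward until it reaches or passes doc.
            while (ex < excluded.length && excluded[ex] < doc) {
                ex++;
            }
            boolean prohibited = ex < excluded.length && excluded[ex] == doc;
            if (!prohibited) {
                candidates[write++] = doc; // keep the doc, closing any gap
            }
        }
        return write;
    }

    public static void main(String[] args) {
        int[] candidates = {1, 4, 7, 9, 12, 15};
        int[] mustNot = {4, 8, 9, 15};
        int n = compact(candidates, candidates.length, mustNot);
        System.out.println(Arrays.toString(Arrays.copyOf(candidates, n)));
    }
}
```

Because the array is compacted in a single pass alongside the MUST_NOT cursor, later stages only see the surviving docs and need no per-doc "prohibited" check.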
In addition, I am now working on combining all the methods explored so far to
get better performance. To measure each method more accurately, I am re-running
the benchmarks for some of the earlier methods.
It may take several days to get a combined method working.
> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
> Key: LUCENE-4396
> URL: https://issues.apache.org/jira/browse/LUCENE-4396
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Attachments: And.tasks, AndOr.tasks, AndOr.tasks, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> SIZE.perf, luceneutil-score-equal.patch, luceneutil-score-equal.patch,
> stat.cpp
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have very low hit count compared
> to the other clauses, that BooleanScorer would perform better than
> BooleanScorer2. BooleanScorer still has some vestiges from when it used to
> handle MUST so it shouldn't be hard to bring back this capability ... I think
> the challenging part might be the heuristics on when to use which (likely we
> would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs
> in this case, eg if suddenly the MUST clause skips 1000000 docs then you want
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you
> are inspired!
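
The .advance() suggestion in the description above can be illustrated with a small, self-contained sketch. It does not use Lucene classes: DocIterator is a hypothetical stand-in that only loosely mirrors the nextDoc()/advance() shape of Lucene's DocIdSetIterator, and SKIP_THRESHOLD, catchUp and countMatches are names invented for this example. The threshold value is arbitrary, not a tuned heuristic.

```java
public class AdvanceSketch {
    /** Hypothetical stand-in for a postings iterator over sorted doc IDs. */
    static final class DocIterator {
        static final int NO_MORE_DOCS = Integer.MAX_VALUE;
        private final int[] docs;
        private int pos = -1;
        int steps = 0; // number of nextDoc()/advance() calls, for illustration

        DocIterator(int... docs) { this.docs = docs; }

        int docID() {
            if (pos < 0) return -1;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        int nextDoc() {
            pos++;
            steps++;
            return docID();
        }

        /** Jumps to the first doc >= target past the current position (binary search). */
        int advance(int target) {
            int lo = pos + 1, hi = docs.length;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (docs[mid] < target) lo = mid + 1; else hi = mid;
            }
            pos = lo;
            steps++;
            return docID();
        }
    }

    // Assumed cut-off: gaps larger than this use advance() instead of stepping.
    static final int SKIP_THRESHOLD = 16;

    /** Brings a SHOULD iterator up to target, skipping when the gap is large. */
    static int catchUp(DocIterator should, int target) {
        int doc = should.docID();
        if (doc >= target) return doc;
        if (target - doc > SKIP_THRESHOLD) return should.advance(target);
        while (doc < target) doc = should.nextDoc();
        return doc;
    }

    /** Counts MUST docs that the SHOULD clause also matches. */
    static int countMatches(DocIterator must, DocIterator should) {
        int matches = 0;
        for (int doc = must.nextDoc(); doc != DocIterator.NO_MORE_DOCS; doc = must.nextDoc()) {
            if (catchUp(should, doc) == doc) matches++;
        }
        return matches;
    }

    public static void main(String[] args) {
        int[] dense = new int[2001];
        for (int i = 0; i < 2000; i++) dense[i] = 2 * i; // 0, 2, ..., 3998
        dense[2000] = 1000000;
        DocIterator must = new DocIterator(4, 6, 1000000);
        DocIterator should = new DocIterator(dense);
        System.out.println("matches=" + countMatches(must, should)
                + " shouldSteps=" + should.steps);
    }
}
```

In this toy setup the SHOULD iterator steps linearly while the MUST docs are close together and makes a single advance() call when the MUST clause jumps to 1000000, rather than visiting the roughly two thousand docs in between.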
--
This message was sent by Atlassian JIRA
(v6.2#6252)