[jira] [Updated] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses

Da Huang (JIRA) Mon, 14 Jul 2014 23:04:31 -0700

     [ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Da Huang updated LUCENE-4396:
-----------------------------

    Attachment: LUCENE-4396.patch

This patch is based on git mirror commit ce7d0578b30981d15687bf76aec595274efccbad.

In this patch, I just compact the array as I go through the MUST_NOT docs.
{code}
              Task  QPS baseline   StdDev  QPS my_version   StdDev                Pct diff
 HighAndTonsLowNot          4.88   (3.5%)            2.44   (4.4%)  -49.9% ( -55% -  -43%)
 HighAndSomeLowNot          6.55   (6.1%)            3.60   (4.7%)  -45.1% ( -52% -  -36%)
  HighAndSomeLowOr          9.93  (12.9%)            5.49   (4.7%)  -44.7% ( -55% -  -31%)
  LowAndSomeLowNot        293.78   (2.3%)          216.29   (1.7%)  -26.4% ( -29% -  -22%)
   LowAndSomeLowOr        347.60   (1.8%)          266.94   (1.2%)  -23.2% ( -25% -  -20%)
  HighAndTonsLowOr          5.59   (5.7%)            4.34   (4.4%)  -22.4% ( -30% -  -13%)
          PKLookup         97.38   (2.1%)           95.54   (2.9%)   -1.9% (  -6% -    3%)
HighAndSomeHighNot          1.88   (2.2%)            1.89   (6.6%)    0.7% (  -7% -    9%)
  LowAndSomeHighOr         41.32   (2.9%)           41.92   (2.8%)    1.5% (  -4% -    7%)
 LowAndSomeHighNot         54.74   (2.4%)           56.73   (2.7%)    3.7% (  -1% -    8%)
 HighAndSomeHighOr          2.26   (2.7%)            2.56   (6.8%)   13.3% (   3% -   23%)
  LowAndTonsLowNot         17.18   (1.2%)           22.14   (2.4%)   28.9% (  24% -   32%)
  LowAndTonsHighOr          1.21   (2.7%)            1.57   (4.4%)   29.8% (  22% -   37%)
   LowAndTonsLowOr         17.22   (1.3%)           22.53   (2.4%)   30.9% (  26% -   35%)
 HighAndTonsHighOr          0.07   (1.2%)            0.16  (13.1%)  141.0% ( 125% -  157%)
 LowAndTonsHighNot          2.02   (2.4%)            9.70   (9.7%)  380.6% ( 360% -  402%)
HighAndTonsHighNot          0.09   (1.2%)            0.50  (23.1%)  475.7% ( 446% -  505%)
{code}
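
For readers who have not opened the patch, here is a rough sketch of what "compacting the array while going through the MUST_NOT docs" could look like. It is not code from LUCENE-4396.patch. It assumes the candidate docs and the MUST_NOT docs are both available as ascending int arrays, and the names CompactSketch and compact are hypothetical.

```java
/** Hypothetical illustration only; not taken from LUCENE-4396.patch. */
final class CompactSketch {

    /**
     * Removes from candidates[0..size) every doc ID that also occurs in
     * prohibited. Both arrays are assumed to be sorted in ascending order.
     * Surviving doc IDs are moved to the front of candidates in place,
     * and the number of survivors is returned.
     */
    static int compact(int[] candidates, int size, int[] prohibited) {
        int write = 0; // next free slot for a surviving doc
        int p = 0;     // cursor into the MUST_NOT docs
        for (int read = 0; read < size; read++) {
            int doc = candidates[read];
            // Move the MUST_NOT cursor forward until it reaches or passes doc.
            while (p < prohibited.length && prohibited[p] < doc) {
                p++;
            }
            boolean excluded = p < prohibited.length && prohibited[p] == doc;
            if (!excluded) {
                candidates[write++] = doc; // keep the doc, closing any gap
            }
        }
        return write;
    }
}
```

Because both inputs are sorted, a single pass suffices and no second array is needed; entries past the returned size are stale and should be ignored.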

In addition, I am now working on combining all the methods explored so far to 
get better performance. To measure each method's performance more accurately, 
I am retesting some of the earlier methods. 
Getting a combined method to work may take several days.

> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
>                 Key: LUCENE-4396
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4396
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: And.tasks, AndOr.tasks, AndOr.tasks, LUCENE-4396.patch, 
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, 
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, 
> SIZE.perf, luceneutil-score-equal.patch, luceneutil-score-equal.patch, 
> stat.cpp
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have very low hit count compared 
> to the other clauses, that BooleanScorer would perform better than 
> BooleanScorer2.  BooleanScorer still has some vestiges from when it used to 
> handle MUST so it shouldn't be hard to bring back this capability ... I think 
> the challenging part might be the heuristics on when to use which (likely we 
> would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs 
> in this case, eg if suddenly the MUST clause skips 1000000 docs then you want 
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you 
> are inspired!
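
The .advance() idea in the description above can be illustrated with a small, self-contained sketch. It is an assumption-laden toy, not Lucene code: ArrayIterator is a hypothetical stand-in whose method names loosely mirror Lucene's DocIdSetIterator (docID, nextDoc, advance), and skipThreshold is an invented tuning knob, not a value from the issue or any patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not Lucene's BooleanScorer. */
final class AdvanceSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Toy iterator over an ascending array of doc IDs. */
    static final class ArrayIterator {
        private final int[] docs;
        private int idx = -1;
        int advanceCalls = 0; // counted only so the behaviour can be observed

        ArrayIterator(int... docs) { this.docs = docs; }

        int docID() {
            if (idx < 0) return -1;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        int nextDoc() { idx++; return docID(); }

        /** Jumps to the first doc >= target; caller ensures docID() < target. */
        int advance(int target) {
            advanceCalls++;
            int lo = idx + 1, hi = docs.length;
            while (lo < hi) { // binary search over the remaining docs
                int mid = (lo + hi) >>> 1;
                if (docs[mid] < target) lo = mid + 1; else hi = mid;
            }
            idx = lo;
            return docID();
        }
    }

    /** For each MUST doc, counts how many SHOULD clauses also match it. */
    static Map<Integer, Integer> shouldMatchCounts(
            ArrayIterator must, ArrayIterator[] shoulds, int skipThreshold) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (int doc = must.nextDoc(); doc != NO_MORE_DOCS; doc = must.nextDoc()) {
            int matched = 0;
            for (ArrayIterator s : shoulds) {
                if (s.docID() < doc) {
                    if (doc - s.docID() > skipThreshold) {
                        s.advance(doc); // large gap: skip ahead in one call
                    } else {
                        while (s.docID() < doc) s.nextDoc(); // small gap: step
                    }
                }
                if (s.docID() == doc) matched++;
            }
            counts.put(doc, matched);
        }
        return counts;
    }
}
```

The threshold is the interesting design question: stepping is cheap for small gaps, while skipping pays off only when the MUST clause jumps far ahead, which is the case the description calls out.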


