[ 
https://issues.apache.org/jira/browse/LUCENE-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-6889:
---------------------------------
    Attachment: LUCENE-6889.patch

Here is a patch that does the following rewrites:

Removal of FILTER clauses that are also MUST clauses
{noformat}
#a +a -> +a
{noformat}

FilteredQuery rewrite when the query is a MatchAllDocsQuery
{noformat}
+*:*^b #f -> ConstantScoreQuery(f)^b
{noformat}
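
To illustrate why this rewrite preserves both matches and scores, here is a minimal sketch on a toy model. It does not use the Lucene API: documents are plain integers, the filter is a set of doc IDs, and query normalization is ignored. The class and method names (MatchAllFilterRewrite, matchAllWithFilter, constantScore) are made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class MatchAllFilterRewrite {
    /** Toy "+*:*^b #f": visit every doc, keep those accepted by the filter, score b. */
    static Map<Integer, Float> matchAllWithFilter(int maxDoc, float boost, Set<Integer> filterDocs) {
        Map<Integer, Float> hits = new LinkedHashMap<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            if (filterDocs.contains(doc)) {
                hits.put(doc, boost); // the FILTER clause does not contribute to the score
            }
        }
        return hits;
    }

    /** Toy "ConstantScoreQuery(f)^b": visit only the filter's docs, constant score b. */
    static Map<Integer, Float> constantScore(int maxDoc, float boost, Set<Integer> filterDocs) {
        Map<Integer, Float> hits = new LinkedHashMap<>();
        for (int doc : new TreeSet<>(filterDocs)) {
            if (doc >= 0 && doc < maxDoc) {
                hits.put(doc, boost);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Set<Integer> filter = new TreeSet<>();
        filter.add(1);
        filter.add(3);
        // Both forms yield the same doc -> score mapping in this toy model.
        System.out.println(matchAllWithFilter(5, 2.0f, filter).equals(constantScore(5, 2.0f, filter)));
    }
}
```

The second form only iterates over the documents accepted by the filter instead of every document in the index, which is where the expected speedup would come from.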

Removal of MatchAllDocsQuery FILTER clauses when the query also has a MUST clause
{noformat}
+a #*:* -> +a
{noformat}

Deduplication of FILTER and MUST_NOT clauses
{noformat}
+a #f #f -f -f -> +a #f -f
{noformat}
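
The clause-level rules in this comment (dropping FILTER clauses that duplicate a MUST clause, dropping "#*:*" when a MUST clause exists, and deduplicating FILTER and MUST_NOT clauses) can be sketched on the string notation used in the examples. This is only an illustration under simplifying assumptions, not the code from the patch: clauses are whitespace-separated tokens, and ClauseSimplifier and simplify are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ClauseSimplifier {
    /** Simplifies a query written as "+must #filter -mustNot should" tokens. */
    static String simplify(String query) {
        String[] clauses = query.trim().split("\\s+");

        // Collect the bodies of all MUST clauses first.
        Set<String> must = new HashSet<>();
        for (String c : clauses) {
            if (c.startsWith("+")) {
                must.add(c.substring(1));
            }
        }

        Set<String> seen = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String c : clauses) {
            if (c.isEmpty()) {
                continue;
            }
            char occur = c.charAt(0);
            if (occur == '#') {
                String body = c.substring(1);
                if (must.contains(body)) {
                    continue; // "#a +a" -> "+a"
                }
                if (body.equals("*:*") && !must.isEmpty()) {
                    continue; // "+a #*:*" -> "+a"
                }
            }
            // Only non-scoring clauses are deduplicated; repeated MUST or
            // SHOULD clauses are kept because they contribute to the score.
            if ((occur == '#' || occur == '-') && !seen.add(c)) {
                continue;
            }
            kept.add(c);
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        System.out.println(simplify("#a +a"));          // prints "+a"
        System.out.println(simplify("+a #*:*"));        // prints "+a"
        System.out.println(simplify("+a #f #f -f -f")); // prints "+a #f -f"
    }
}
```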

A nice property of these rewrites is that some queries that we used to execute 
as a disjunction or a conjunction can now be executed as a simple term query.

I also wanted to rewrite queries to a MatchNoDocsQuery when there was an 
intersection between required and prohibited clauses (Terry's rule 3) or when 
the minimumShouldMatch is greater than the number of SHOULD clauses, but this 
broke weight normalization. We can probably solve the MUST_NOT/MUST 
intersection at the Scorer level, but I propose to defer it to another issue.
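
For reference, the two conditions under which such a query can never match could be detected along these lines. This is a sketch on plain sets of clause names rather than Lucene Query objects; NoMatchDetection and canNeverMatch are hypothetical names, and the sketch says nothing about how to handle the weight normalization problem.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class NoMatchDetection {
    /**
     * @param required   clauses with MUST or FILTER occur
     * @param prohibited clauses with MUST_NOT occur
     */
    static boolean canNeverMatch(Set<String> required, Set<String> prohibited,
                                 int numShouldClauses, int minimumShouldMatch) {
        // A clause that is both required and prohibited can never be satisfied.
        if (!Collections.disjoint(required, prohibited)) {
            return true;
        }
        // More SHOULD matches are demanded than there are SHOULD clauses.
        return minimumShouldMatch > numShouldClauses;
    }

    public static void main(String[] args) {
        Set<String> required = new HashSet<>(Arrays.asList("a", "f"));
        Set<String> prohibited = new HashSet<>(Arrays.asList("f"));
        System.out.println(canNeverMatch(required, prohibited, 0, 0)); // prints true
    }
}
```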

The patch includes unit tests for the above rewrite rules as well as a 
randomized test that checks that rewritten queries produce the same matches and 
scores as the same queries executed without rewriting.
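
The idea behind such a randomized test can be sketched on a toy model as follows. This is not the test from the patch and does not use Lucene's test framework: documents are strings of single-letter terms, a clause is an occur character followed by a term, '~' is a marker invented here for SHOULD clauses, and the score is simply the number of matching MUST and SHOULD clauses. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class RandomRewriteCheck {
    static final String TERMS = "abc";
    static final String OCCURS = "+#-~"; // MUST, FILTER, MUST_NOT, SHOULD (toy marker)

    /** Toy rewrite: drop FILTERs that are also MUST, dedup FILTER and MUST_NOT. */
    static List<String> rewrite(List<String> clauses) {
        Set<String> must = new HashSet<>();
        for (String c : clauses) {
            if (c.charAt(0) == '+') must.add(c.substring(1));
        }
        Set<String> seen = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String c : clauses) {
            char occur = c.charAt(0);
            if (occur == '#' && must.contains(c.substring(1))) continue;
            if ((occur == '#' || occur == '-') && !seen.add(c)) continue;
            kept.add(c);
        }
        return kept;
    }

    /** Toy scorer: -1 if the doc does not match, else the count of matching scoring clauses. */
    static int score(List<String> clauses, String doc) {
        int score = 0;
        for (String c : clauses) {
            char occur = c.charAt(0);
            boolean present = doc.contains(c.substring(1));
            if (occur == '-' && present) return -1;
            if ((occur == '+' || occur == '#') && !present) return -1;
            if ((occur == '+' || occur == '~') && present) score++;
        }
        return score;
    }

    /** Compares original and rewritten random queries on every possible toy document. */
    static int countMismatches(long seed, int iterations) {
        Random random = new Random(seed);
        String[] docs = {"", "a", "b", "c", "ab", "ac", "bc", "abc"};
        int mismatches = 0;
        for (int i = 0; i < iterations; i++) {
            List<String> query = new ArrayList<>();
            int numClauses = 1 + random.nextInt(6);
            for (int j = 0; j < numClauses; j++) {
                char occur = OCCURS.charAt(random.nextInt(OCCURS.length()));
                char term = TERMS.charAt(random.nextInt(TERMS.length()));
                query.add("" + occur + term);
            }
            List<String> rewritten = rewrite(query);
            for (String doc : docs) {
                if (score(query, doc) != score(rewritten, doc)) mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println(countMismatches(42L, 1000)); // prints 0
    }
}
```

Using a fixed seed keeps any failure reproducible, and a small vocabulary makes duplicate and overlapping clauses frequent enough to exercise the rewrite rules.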

> BooleanQuery.rewrite could easily optimize some simple cases
> ------------------------------------------------------------
>
>                 Key: LUCENE-6889
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6889
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-6889.patch
>
>
> Follow-up of SOLR-8251: APIs and user interfaces sometimes encourage users to 
> write BooleanQuery instances that are not optimal. For instance, a typical 
> case with Solr/Elasticsearch is to send a request that has a 
> MatchAllDocsQuery as a query and some filter, which could be executed more 
> efficiently by directly wrapping the filter into a ConstantScoreQuery.
> Here are some ideas of rewrite operations that BooleanQuery could perform:
>  - remove FILTER clauses when they are also a MUST clause
>  - rewrite queries of the form "+*:* #filter" to a ConstantScoreQuery(filter)
>  - rewrite to a MatchNoDocsQuery when a clause that is a MUST or FILTER 
> clause is also a MUST_NOT clause


