[ 
https://issues.apache.org/jira/browse/LUCENE-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-6695:
---------------------------------
    Attachment: LUCENE-6695.patch

Here is a patch. It computes the aggregated document frequency of several terms 
as the maximum of their document frequencies, and the aggregated total term 
frequency as the sum of their total term frequencies.
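
To illustrate the aggregation described above, here is a minimal standalone 
sketch. It is not the code from the patch: BlendedStatsSketch, TermStats and 
blend are hypothetical names made up for this example, and no Lucene classes 
are used. It only shows the max/sum rule on plain numbers.

```java
import java.util.Arrays;
import java.util.List;

public class BlendedStatsSketch {

    /** Hypothetical holder for per-term index statistics (not a Lucene class). */
    static final class TermStats {
        final long docFreq;        // number of documents containing the term
        final long totalTermFreq;  // total number of occurrences of the term

        TermStats(long docFreq, long totalTermFreq) {
            this.docFreq = docFreq;
            this.totalTermFreq = totalTermFreq;
        }
    }

    /**
     * Blends statistics of several terms: the doc freq is the maximum of the
     * individual doc freqs, the total term freq is the sum of the individual
     * total term freqs.
     */
    static TermStats blend(List<TermStats> terms) {
        long docFreq = 0;
        long totalTermFreq = 0;
        for (TermStats t : terms) {
            docFreq = Math.max(docFreq, t.docFreq);
            totalTermFreq += t.totalTermFreq;
        }
        return new TermStats(docFreq, totalTermFreq);
    }

    public static void main(String[] args) {
        // Two terms, e.g. synonyms, with made-up statistics.
        TermStats blended = blend(Arrays.asList(
            new TermStats(10, 25),
            new TermStats(4, 6)));
        // prints "docFreq=10 totalTermFreq=31"
        System.out.println("docFreq=" + blended.docFreq
            + " totalTermFreq=" + blended.totalTermFreq);
    }
}
```

Every blended term would then be scored with the same pair of statistics, so 
the terms are weighted identically regardless of their individual frequencies.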

I put the query in lucene/core so that TopTermsBlendedFreqScoringRewrite could 
reuse it, and marked it as experimental. If anyone is uncomfortable with that, 
I can revert the changes to TopTermsBlendedFreqScoringRewrite and move the 
query to the sandbox.

> BlendedTermQuery
> ----------------
>
>                 Key: LUCENE-6695
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6695
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-6695.patch
>
>
> It is sometimes desirable to ignore differences between index statistics of 
> several terms so that they produce the same scores, for instance if you 
> resolve synonyms at search time or if you want to search across several 
> fields. Elasticsearch has been using this approach for its multi_match query 
> for some time now.
> We already blend statistics in TopTermsBlendedFreqScoringRewrite (used by 
> FuzzyQuery), but it could be helpful to have a dedicated query that lets you 
> choose manually which terms to blend statistics from.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

