Jim Ferenczi created LUCENE-8840:
------------------------------------
Summary: TopTermsBlendedFreqScoringRewrite should use SynonymQuery
Key: LUCENE-8840
URL: https://issues.apache.org/jira/browse/LUCENE-8840
Project: Lucene - Core
Issue Type: Improvement
Reporter: Jim Ferenczi
Today TopTermsBlendedFreqScoringRewrite, which is the default rewrite
method for fuzzy queries, uses BlendedTermQuery to score documents that
match the fuzzy terms. This query blends the frequencies used for scoring
across the terms and creates a disjunction of all the blended terms. This means
that each fuzzy term that matches in a document adds its own BM25 score
contribution. We already have a query that can blend the statistics of multiple
terms in a single scorer that sums the terms' frequencies within each document
rather than their entire BM25 scores: SynonymQuery. Since
https://issues.apache.org/jira/browse/LUCENE-8652 this query also handles boosts
between 0 and 1, so it should be easy to change the default rewrite method for
fuzzy queries to use it instead of BlendedTermQuery. This would bound the
contribution of each term to the final score, which seems a better alternative
in terms of relevancy than the current solution.
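
To make the difference concrete, here is a minimal standalone sketch (not
Lucene code) that compares the two ways of combining matching fuzzy terms. It
assumes a simplified BM25 with length normalization left out, a single shared
idf for all blended terms, and that a boost scales a term's frequency before
the frequencies are summed. The class name FuzzyScoringSketch and its methods
are hypothetical names chosen for illustration.

```java
// Hypothetical illustration only; none of these names exist in Lucene.
final class FuzzyScoringSketch {

    // Simplified BM25 term-frequency saturation (length normalization omitted).
    // The result approaches (k1 + 1) as freq grows, so it is bounded.
    static double tfSaturation(double freq, double k1) {
        return freq * (k1 + 1) / (freq + k1);
    }

    // Disjunction style: every matching term adds its own saturated score,
    // so the total grows with the number of matching terms.
    static double sumOfScores(double idf, double k1, double[] freqs, double[] boosts) {
        double score = 0;
        for (int i = 0; i < freqs.length; i++) {
            score += boosts[i] * idf * tfSaturation(freqs[i], k1);
        }
        return score;
    }

    // Synonym style (assumed behavior): boosted frequencies are summed first
    // and the saturation is applied once, so the total stays below idf * (k1 + 1).
    static double scoreOfSummedFreqs(double idf, double k1, double[] freqs, double[] boosts) {
        double freq = 0;
        for (int i = 0; i < freqs.length; i++) {
            freq += boosts[i] * freqs[i];
        }
        return idf * tfSaturation(freq, k1);
    }

    public static void main(String[] args) {
        double idf = 1.0;
        double k1 = 1.2;
        double[] freqs = {2, 2, 2};   // three fuzzy terms, each occurring twice
        double[] boosts = {1, 1, 1};  // boosts in [0, 1]
        System.out.println("sum of per-term scores: " + sumOfScores(idf, k1, freqs, boosts));
        System.out.println("score of summed freqs:  " + scoreOfSummedFreqs(idf, k1, freqs, boosts));
        System.out.println("single-term upper bound: " + idf * (k1 + 1));
    }
}
```

In this sketch the per-term sum can exceed what any single term could ever
score, while the summed-frequency variant cannot, which is the bounding effect
described above.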