[
https://issues.apache.org/jira/browse/LUCENE-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Adrien Grand updated LUCENE-8083:
---------------------------------
Attachment: LUCENE-8083.patch
Here is a patch that improves BM25's maxScore by taking the maxFreq into
account, and implements maxScore on all SimilarityBase impls by passing
freq=maxFreq and docLen=1 to the score method. I also added new tests that are
specific to this maxScore method.
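
To illustrate the idea, here is a minimal standalone sketch. It is not the code from the patch: the class MaxScoreSketch, its method names and the constants are hypothetical, and it assumes the classic BM25 term-frequency formula. Since such a score grows with freq and shrinks with document length, evaluating it at freq=maxFreq and docLen=1 gives an upper bound.

```java
// Hypothetical standalone sketch; these are NOT Lucene's classes or method signatures.
final class MaxScoreSketch {
    // Commonly used BM25 parameters (assumed here for illustration).
    static final double K1 = 1.2;
    static final double B = 0.75;

    /** Classic BM25-style term score for a term frequency and a document length. */
    static double score(double idf, double freq, double docLen, double avgDocLen) {
        double norm = K1 * (1 - B + B * docLen / avgDocLen);
        return idf * freq * (K1 + 1) / (freq + norm);
    }

    /** Upper bound: the highest frequency combined with the shortest document (length 1). */
    static double maxScore(double idf, double maxFreq, double avgDocLen) {
        return score(idf, maxFreq, 1.0, avgDocLen);
    }

    /** Frequency-agnostic bound: the limit of the score as freq grows without bound. */
    static double naiveMaxScore(double idf) {
        return idf * (K1 + 1);
    }

    public static void main(String[] args) {
        double idf = 2.0, avgDocLen = 10.0;
        System.out.println("bound with maxFreq=5: " + maxScore(idf, 5, avgDocLen));
        System.out.println("bound without maxFreq: " + naiveMaxScore(idf));
    }
}
```

Under these assumptions the bound that uses maxFreq is never larger than the frequency-agnostic one, and it still dominates the score of any document whose frequency is at most maxFreq and whose length is at least 1.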
Practically, this means that the LUCENE-4100 optimizations now work well with
similarities whose scores saturate quickly as term frequency increases, such as
all DFR similarities, IBSimilarity with DistributionSPL, AxiomaticF2EXP and
AxiomaticF2LOG. Other similarities might benefit as well in the future if we
start recording the maximum term frequency per term (or per field, which would
be a good start).
> Give similarities better values for maxScore
> --------------------------------------------
>
> Key: LUCENE-8083
> URL: https://issues.apache.org/jira/browse/LUCENE-8083
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-8083.patch
>
>
> The benefits of LUCENE-4100 largely depend on the quality of the upper bound
> of the scores that is provided by the similarity.