[ 
https://issues.apache.org/jira/browse/LUCENE-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-8083:
---------------------------------
    Attachment: LUCENE-8083.patch

Here is a patch that improves BM25's maxScore by taking maxFreq into 
account, and implements maxScore on all SimilarityBase implementations by 
passing freq=maxFreq and docLen=1 to the score method. I also added new tests 
that are specific to the maxScore method.
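
To make the idea concrete, here is a minimal sketch in Java of a BM25 upper 
bound computed at freq=maxFreq and docLen=1. It is not the code from the patch: 
the class and method names are hypothetical, the formula is the textbook BM25 
term score, and k1=1.2, b=0.75 are assumed parameter values.

```java
// Hypothetical sketch, not Lucene's actual implementation.
final class Bm25MaxScoreSketch {
    // Assumed BM25 parameters (commonly used defaults).
    static final double K1 = 1.2;
    static final double B = 0.75;

    /** Textbook BM25 score of one term for a given frequency and document length. */
    static double score(double idf, double freq, double docLen, double avgDocLen) {
        double lengthNorm = K1 * (1 - B + B * docLen / avgDocLen);
        return idf * (K1 + 1) * freq / (freq + lengthNorm);
    }

    /**
     * Upper bound on the score: the score grows with freq and shrinks with
     * docLen, so evaluating it at freq=maxFreq and docLen=1 bounds every
     * document whose frequency is at most maxFreq and whose length is at least 1.
     */
    static double maxScore(double idf, double maxFreq, double avgDocLen) {
        return score(idf, maxFreq, 1, avgDocLen);
    }

    /** Bound that ignores maxFreq: the limit of the score as freq grows. */
    static double maxScoreIgnoringFreq(double idf) {
        return idf * (K1 + 1);
    }

    public static void main(String[] args) {
        double idf = 2.0;         // illustrative value
        double avgDocLen = 100.0; // illustrative value
        System.out.println("bound with maxFreq=3: " + maxScore(idf, 3, avgDocLen));
        System.out.println("bound ignoring freq:  " + maxScoreIgnoringFreq(idf));
    }
}
```

The bound that uses maxFreq is never larger than the one that ignores it, which 
is what lets more documents be skipped.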

Practically, this means that the LUCENE-4100 optimizations now work well with 
similarities whose score saturates quickly with increasing frequencies, such as 
all DFR similarities, IBSimilarity with DistributionSPL, AxiomaticF2EXP and 
AxiomaticF2LOG. They might work well with other similarities too in the 
future if we start recording the per-term (or, as a first step, per-field) 
maximum term frequency.
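
The following Java sketch illustrates why saturation matters. It assumes that, 
without a recorded maximum term frequency, maxFreq has to be a very large value 
(Integer.MAX_VALUE here, which is an assumption for illustration). Both scoring 
functions are toy examples, not Lucene similarities, and all names are 
hypothetical.

```java
import java.util.function.DoubleBinaryOperator;

// Hypothetical sketch: compares the generic bound for two toy score functions.
final class MaxScoreSaturationSketch {
    /** Generic bound: evaluate the score function at freq=maxFreq, docLen=1. */
    static double maxScore(DoubleBinaryOperator score, double maxFreq) {
        return score.applyAsDouble(maxFreq, 1.0);
    }

    // Toy saturating score: approaches 1 as freq grows.
    static final DoubleBinaryOperator SATURATING =
        (freq, docLen) -> freq / (freq + docLen);

    // Toy non-saturating score: keeps growing with freq.
    static final DoubleBinaryOperator NON_SATURATING =
        (freq, docLen) -> Math.log1p(freq / docLen);

    public static void main(String[] args) {
        double unknownMaxFreq = Integer.MAX_VALUE; // assumption: nothing recorded
        double recordedMaxFreq = 5;                // illustrative recorded value
        System.out.println("saturating, unknown maxFreq:      "
            + maxScore(SATURATING, unknownMaxFreq));
        System.out.println("saturating, recorded maxFreq:     "
            + maxScore(SATURATING, recordedMaxFreq));
        System.out.println("non-saturating, unknown maxFreq:  "
            + maxScore(NON_SATURATING, unknownMaxFreq));
        System.out.println("non-saturating, recorded maxFreq: "
            + maxScore(NON_SATURATING, recordedMaxFreq));
    }
}
```

For the saturating function the bound barely changes when the real maximum 
frequency is unknown, while for the non-saturating one it becomes much looser, 
so a recorded maximum frequency would help the second kind most.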

> Give similarities better values for maxScore
> --------------------------------------------
>
>                 Key: LUCENE-8083
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8083
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-8083.patch
>
>
> The benefits of LUCENE-4100 largely depend on the quality of the upper bound 
> of the scores that is provided by the similarity.



