[ https://issues.apache.org/jira/browse/LUCENE-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892344#action_12892344 ]

Mark Harwood commented on LUCENE-2557:
--------------------------------------

bq. Fixing all expansions to IDF(QT) would remove dynamics of the score, making 
the contribution to the score for all expansions identical. 

The "boost" property is used by fuzzy queries, synonym expansion, etc. to 
express the preference for one term variant over another. The effect of this 
boost setting is demonstrably wiped out when the unfiltered IDF of each term 
variant is used (see the attached JUnit test).

bq. , why not simply preserving expansion Term IDF,

See above. The objective is for all variants in an expanded query to share the 
same IDF setting in order for the boost setting to work as required.
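
To make the effect concrete, here is a rough sketch of the arithmetic, not 
the code from the attached patch. It assumes the classic IDF formula 
1 + ln(numDocs / (docFreq + 1)) and a per-term contribution of 
boost * idf^2 with tf = 1, ignoring queryNorm, coord and length norms. The 
class name, document frequencies and the 0.6 boost are all made up for 
illustration.

```java
// Simplified sketch of classic TF-IDF term scoring; not Lucene's actual code path.
final class IdfBoostSketch {

    // Classic IDF shape: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Contribution of one matching term with tf = 1. IDF enters twice:
    // once through the query weight and once through the field weight.
    static double score(double boost, double idf) {
        return boost * idf * idf;
    }

    public static void main(String[] args) {
        int numDocs = 1_000_000;      // hypothetical index size
        int dfExact = 10_000;         // hypothetical docFreq of "smith"
        int dfTypo = 2;               // hypothetical docFreq of "smiht"
        double boostExact = 1.0;      // exact match keeps full boost
        double boostTypo = 0.6;       // hypothetical lower boost for the fuzzy variant

        double idfExact = idf(dfExact, numDocs);
        double idfTypo = idf(dfTypo, numDocs);

        // Each variant scored with its own IDF: the rare misspelling's IDF
        // outweighs its lower boost.
        System.out.println("own IDF    smith=" + score(boostExact, idfExact)
                + " smiht=" + score(boostTypo, idfTypo));

        // All variants share the IDF of the query term: only the boost
        // differs, so the exact match ranks first.
        System.out.println("shared IDF smith=" + score(boostExact, idfExact)
                + " smiht=" + score(boostTypo, idfExact));
    }
}
```

With per-variant IDF the rare misspelling comes out ahead despite its lower 
boost; with a shared IDF the ordering follows the boost alone.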

> FuzzyQuery - fuzzy terms and misspellings are ranked higher than exact matches
> ------------------------------------------------------------------------------
>
>                 Key: LUCENE-2557
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2557
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Query/Scoring
>    Affects Versions: 3.0.2
>            Reporter: Jingkei Ly
>         Attachments: idf-scoring-test-case.patch, LUCENE-2557.patch
>
>
> The FuzzyQuery often causes misspellings to be ranked higher than the exact 
> match, which seems to be an undesirable property generally. 
> For example, in an index of surnames, if I search using a FuzzyQuery for 
> "smith", the misspellings such as "smiith", or "smiht" would appear near the 
> top of the search results ahead of documents that match "smith".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


