[ 
https://issues.apache.org/jira/browse/LUCENE-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891680#action_12891680
 ] 

Robert Muir commented on LUCENE-2557:
-------------------------------------

bq. I agree that fuzzy is to find misspellings, but I don't think it should 
favour misspellings above an exact match.

So what is the problem with TopTermsBoostOnlyBooleanQueryRewrite? It will never 
do this.

While I agree this is a really simple solution to the problem, it seemed to me 
from the comments in LUCENE-329 that there were differing opinions on how one 
might want to combine the factors of edit distance boost, tf, idf, etc. The 
right combination seems to depend on the application.
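
To make the trade-off concrete, here is a toy sketch in plain Java. It is not 
Lucene's scoring code: the class name, the method names, the simplified boost 
and idf formulas, and the document counts are all assumptions chosen for 
illustration. It compares a score that multiplies the edit-distance boost by 
idf with a score that uses the boost alone.

```java
public class FuzzyScoreSketch {
    // Assumed idf shape: rarer terms (low docFreq) get a larger value.
    public static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Assumed toy boost: 1.0 for an exact match, lower as the edit count grows.
    public static double editBoost(int edits, int termLength) {
        return 1.0 - (double) edits / termLength;
    }

    // Combination 1: the boost is multiplied by idf, so rare terms are rewarded.
    public static double idfWeighted(int edits, int termLength, int docFreq, int numDocs) {
        return editBoost(edits, termLength) * idf(docFreq, numDocs);
    }

    // Combination 2: only the edit-distance boost counts; term rarity is ignored.
    public static double boostOnly(int edits, int termLength) {
        return editBoost(edits, termLength);
    }

    public static void main(String[] args) {
        int numDocs = 10000; // hypothetical index size
        // Hypothetical frequencies: "smith" is common, the misspelling "smiht" is rare.
        double exactIdf = idfWeighted(0, 5, 500, numDocs);
        double typoIdf = idfWeighted(2, 5, 1, numDocs);
        System.out.println("idf-weighted: smith=" + exactIdf + " smiht=" + typoIdf);
        System.out.println("boost-only:   smith=" + boostOnly(0, 5) + " smiht=" + boostOnly(2, 5));
    }
}
```

With these made-up numbers the rare misspelling outscores the exact match once 
idf is multiplied in, while the boost-only score keeps the exact match on top. 
Which behaviour is preferable is exactly the application-dependent question.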

So I definitely don't think this is any bug in FuzzyQuery. Personally, I am not 
against adding new alternative rewrite methods like the one you added here, so 
that more choices are available. But this just seems to be the same issue as 
LUCENE-329 to me.

My personal preference would be to take this code and bring LUCENE-329 up to 
speed, e.g. creating an alternative in contrib/queries or something that uses 
Mark Harwood's "smart fuzzy" logic, which is currently limited to FuzzyLikeThis.


> FuzzyQuery - fuzzy terms and misspellings are ranked higher than exact matches
> ------------------------------------------------------------------------------
>
>                 Key: LUCENE-2557
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2557
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Query/Scoring
>    Affects Versions: 3.0.2
>            Reporter: Jingkei Ly
>         Attachments: idf-scoring-test-case.patch, LUCENE-2557.patch
>
>
> The FuzzyQuery often causes misspellings to be ranked higher than the exact 
> match, which seems to be an undesirable property generally. 
> For example, in an index of surnames, if I search using a FuzzyQuery for 
> "smith", the misspellings such as "smiith", or "smiht" would appear near the 
> top of the search results ahead of documents that match "smith".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

