On 25 Jan 2007, at 20:43, Ryan O'Hara wrote:

Is there any way to sort the suggestions beforehand, so that grabbing only one suggestion would give you the best suggestion, in this case "genetics"?

Without having looked at the code for a long time, I think the problem is what the Lucene scoring considers to be best. First the n-grams are searched, resulting in a number of hits. Then the edit distance is calculated on each hit. "Genetics" is apparently the third most similar hit according to Lucene, but the best according to Levenshtein.

I.e., Lucene does not use edit distance as its similarity measure. You need to fetch a bunch of the best hits in order to find the one with the smallest edit distance.
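
Roughly what I mean, as a minimal sketch rather than the actual spell checker code: ask for several suggestions instead of one, then pick the one closest to the input by Levenshtein distance yourself. The class name, the method names, the misspelled input "genetcs" and the candidate list are all made up for illustration; the candidates stand in for whatever hits the spell checker returned, in Lucene's order.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: re-ranks already fetched
// suggestions by plain Levenshtein distance to the input word.
final class EditDistanceRerank {

    // Two-row dynamic-programming Levenshtein distance
    // (insert, delete and substitute each cost 1).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the candidate with the smallest edit distance to word,
    // or null if there are no candidates. Ties keep the earlier hit.
    static String closest(String word, List<String> candidates) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String candidate : candidates) {
            int d = levenshtein(word, candidate);
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Assumed example data: hits in Lucene's order, best match third.
        List<String> hits = Arrays.asList("generic", "kinetics", "genetics");
        System.out.println(closest("genetcs", hits));
    }
}
```

Because ties keep the earlier hit, Lucene's own ordering still decides between candidates that are equally far from the input.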


Hope this helps.

--
karl
