On 08/02/2013 10:16 AM, Ankit Murarka wrote:
> is it possible to implement Complete Phrase Suggest Feature in Lucene
> 4.3 . So if I enter an incorrect phrase it can suggest me few possible
> valid phrases.
>
> One way could be to get suggestion for each word in the sentence and
> calling SpellChecker.suggestSimilar for each word. This can be done
> but this won't help me build a near possible phrase.
>
> If I input "Wanna chk Luc Fetre" then I will get different spell
> suggestions for each word but this wont help me build a near exact
> phrase.
I did something similar some time ago (I used the Lucene 4.0 trunk before its release, and I don't know whether the spellchecker API has changed since then).

The idea is simple:
- Take a list of valid phrases and index whole phrases as spellchecker suggestions.

My implementation:
- As the list of valid phrases I took queries from a search engine query log.
- At index time, besides saving the phrases, I also saved the number of occurrences of each phrase.
- My phrase suggestion took the 5 phrases most similar to the given query and returned the most common of them from the index.

It's very simple and works quite well.

A few tips:
- Think about when to show a phrase suggestion, e.g. show a suggestion only if the most common suggested phrase occurs 10 times more often than the given query.
- Explore different distance measures and their parameters.
- Maybe it would be good to use only word 3-grams as phrases (if you have the query "how to use lucene", you would index "how to use" and "to use lucene" as phrases); then you would "fix" the given query part by part.
- To explore more solutions to this problem, search for papers on "related query suggestion".
- Twitter came to a similar idea as I did: https://blog.twitter.com/2012/related-queries-and-spelling-corrections-search

Regards,
Ivan Krišto
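
P.S. The approach described above can be roughly sketched in plain Python. This is only an illustration of the idea, not the Lucene-based implementation: it does not use the Lucene spellchecker at all, the similarity measure (difflib's SequenceMatcher ratio) is an assumption standing in for whatever distance measure you choose, and the function names (build_phrase_index, suggest_phrase, word_ngrams) and the parameters top_n and min_ratio are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def normalize(phrase):
    """Lowercase a phrase and collapse runs of whitespace."""
    return " ".join(phrase.lower().split())


def build_phrase_index(query_log):
    """Map each normalized phrase to its number of occurrences in the log."""
    return Counter(normalize(q) for q in query_log)


def similarity(a, b):
    # Assumption: SequenceMatcher stands in for the spellchecker's
    # distance measure (edit distance, n-gram distance, ...).
    return SequenceMatcher(None, a, b).ratio()


def suggest_phrase(query, index, top_n=5, min_ratio=10):
    """Return the most common of the top_n phrases most similar to the query.

    Returns None unless that phrase occurs at least min_ratio times
    more often than the query itself (unseen queries count as 1).
    """
    query = normalize(query)
    candidates = [p for p in index if p != query]
    nearest = sorted(candidates, key=lambda p: similarity(query, p),
                     reverse=True)[:top_n]
    if not nearest:
        return None
    best = max(nearest, key=lambda p: index[p])
    if index[best] >= min_ratio * max(index[query], 1):
        return best
    return None


def word_ngrams(phrase, n=3):
    """Split a phrase into overlapping word n-grams (the 3-gram tip)."""
    words = normalize(phrase).split()
    if len(words) <= n:
        return [" ".join(words)]
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


if __name__ == "__main__":
    log = (["how to use lucene"] * 50
           + ["lucene spell checker"] * 20
           + ["how to use lucine"] * 2)
    index = build_phrase_index(log)
    print(suggest_phrase("how to use lucine", index))
    print(word_ngrams("how to use lucene"))
```

With a real query log you would probably want to try several similarity measures and thresholds, since the right values for top_n and min_ratio depend on the data.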