Sounds interesting Martin! Is the dictionary static, or is it generated from the corpus or from user queries?
-Yonik On 3/20/07, Martin Haye <[EMAIL PROTECTED]> wrote:
As part of XTF, an open source publishing engine that uses Lucene, I developed a new spelling correction engine specifically to provide "Did you mean..." links for misspelled queries. I and a small group are preparing this for submission as a contrib module to Lucene. And we're inviting interested people to join the discussion about it. The new engine is being called "Spelt" and differs from the one currently in Lucene contrib in the following ways: - More accurate: Much better performance on single-word queries (90% correct in #1 slot in my tests). On general list including multi-word queries, gets 80%+ correct. - Multi-word: Handles and corrects multi-word queries such as "harrypotter" -> "harry potter". - Fast: In my tests, builds the dictionary more than 30 times faster. - Small: Dictionary size is roughly a third of that built by the existing engine. - Other bells and whistles... There is already a standalone test program that people can try out, and we're interested in feedback. If you're interested in discussing, testing, or previewing, consider joining the Google group: http://groups.google.com/group/spelt/ --Martin
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]