Somehow I seem to have missed (and can't find) your original mail, but it seems like you're asking about using double metaphone for place names. We've done this on our site (http://boston.povo.com) for street and place names, and I can't say we've been happy with the results. We're toying with ngrams instead, but neither is "well tuned" to the particular types of mistakes people seem to make for place names, and I haven't really found or figured out something that is.
-----Original Message----- From: Daniel Noll [mailto:dan...@nuix.com] Sent: Thursday, December 18, 2008 12:45 AM To: java-user@lucene.apache.org Subject: Re: double metaphone for misspellings Geoff Hendrey wrote: > ((POINameType)name).getText().split("\\s"); //tokenize manually. (gosh, > I thought the analyser would do this) The analyser does do this... but related to this, the Right Way to do it in your case would be to write your own analyser specifically for that field, and do all the metaphone magic in the analyser. You probably want an analyser which chains onto the StandardAnalyzer and adds an additional token filter to do the double metaphone stuff. Writing a token filter isn't too hard, PorterStemFilter is a relatively good example of doing something similar. So you would end up with a DoubleMetaphoneFilter, which you could then use with PerFieldAnalyzerWrapper to have it apply only to the fields you use that for. Daniel -- Daniel Noll Forensic and eDiscovery Software Senior Developer The world's most advanced Nuix email data analysis http://nuix.com/ and eDiscovery software --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org