RE: double metaphone for misspellings

2008-12-18 Thread Geoff Hendrey
an...@nuix.com]
Sent: Thursday, December 18, 2008 12:45 AM
To: java-user@lucene.apache.org
Subject: Re: double metaphone for misspellings
Geoff Hendrey wrote:
> ((POINameType)name).getText().split("\\s"); //tokenize manually. (gosh,
> I thought the analyser would do this)
The analys

RE: double metaphone for misspellings

2008-12-18 Thread Max Metral
Sent: Thursday, December 18, 2008 12:45 AM
To: java-user@lucene.apache.org
Subject: Re: double metaphone for misspellings
Geoff Hendrey wrote:
> ((POINameType)name).getText().split("\\s"); //tokenize manually. (gosh,
> I thought the analyser would do this)
The analyser does do this... bu

Re: double metaphone for misspellings

2008-12-17 Thread Daniel Noll
Geoff Hendrey wrote:
> ((POINameType)name).getText().split("\\s"); //tokenize manually. (gosh,
> I thought the analyser would do this)
The analyser does do this... but related to this, the Right Way to do it in your case would be to write your own analyser specifically for that field, and do all

double metaphone for misspellings

2008-12-17 Thread Geoff Hendrey
The Apache Commons Codec library has a double metaphone algorithm. I tried a series of experiments around storing the double metaphone representations of strings in the index itself, and searching using the double metaphone version of the search terms when the field I am searching against is stored as double meta
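
The idea in this post (index the phonetic form of each token, then encode the query terms the same way before lookup) can be sketched roughly as below. This is only an illustration under stated assumptions, not the poster's code: the class PhoneticIndexSketch, its in-memory map, and the crudeKey encoder are all hypothetical. crudeKey is a deliberately simple stand-in, not double metaphone; with Commons Codec on the classpath, something like new DoubleMetaphone()::encode could presumably be passed in its place. The manual split on whitespace mirrors the split("\\s") call quoted in the replies.

```java
import java.util.*;
import java.util.function.Function;

// Hypothetical sketch: a tiny in-memory index keyed by phonetic codes.
class PhoneticIndexSketch {
    private final Function<String, String> encoder;
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    PhoneticIndexSketch(Function<String, String> encoder) {
        this.encoder = encoder;
    }

    // Stand-in encoder (NOT double metaphone): upper-case, letters only,
    // drop vowels after the first character, collapse repeated consonants.
    static String crudeKey(String token) {
        String s = token.toUpperCase(Locale.ROOT).replaceAll("[^A-Z]", "");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (i > 0 && "AEIOU".indexOf(c) >= 0) continue;
            if (sb.length() > 0 && sb.charAt(sb.length() - 1) == c) continue;
            sb.append(c);
        }
        return sb.toString();
    }

    // Index time: tokenize on whitespace and store the encoded form of each token.
    void add(int docId, String text) {
        for (String token : text.trim().split("\\s+")) {
            String key = encoder.apply(token);
            if (key == null || key.isEmpty()) continue;
            postings.computeIfAbsent(key, k -> new TreeSet<>()).add(docId);
        }
    }

    // Query time: encode each query token the same way, then AND the postings.
    Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String token : query.trim().split("\\s+")) {
            String key = encoder.apply(token);
            if (key == null || key.isEmpty()) continue;
            Set<Integer> docs = postings.getOrDefault(key, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        PhoneticIndexSketch index = new PhoneticIndexSketch(PhoneticIndexSketch::crudeKey);
        index.add(1, "Hendrey Coffee House");
        index.add(2, "Metral Bakery");
        // A misspelled query still reaches document 1 via matching keys.
        System.out.println(index.search("Hendry cofee"));
    }
}
```

The key design point is that the same encoder runs at index time and at query time, so a misspelling only has to produce the same phonetic key as the stored token, not the same characters. In Lucene itself the encoding step would more naturally live in a field-specific analyser, as the reply in this thread suggests.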