On Mon, Jul 26, 2010 at 1:13 PM, <[email protected]> wrote:
>
> What I want to capture is situations where people misspell things in
> roughly a phonetic way. For example, “Tchaikovsky Avenue” might be
> misspelled as “Chicovsky Avenue”. Modules that do phonetic mapping are
> possible but you’d have to somehow generate a phonetic database of (say)
> streetnames, worldwide. Good luck on getting hold of that kind of data
> anywhere. ;-) In the absence of such data, an LD distance will have to do –
> but it will almost certainly need to be greater than 2.
>

I added this to 'TestPhoneticFilter' and it passes:

assertAlgorithm(new DoubleMetaphone(), false, "Tchaikovsky Chicovsky",
    new String[] { "XKFS", "XKFS" });

So if you want to give me all your street names, I can sell you a phonetic
database, or you can use the filters in modules/analyzers/phonetic, which
have a bunch of different configurable algorithms :)

-- 
Robert Muir
[email protected]
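
For comparison with the phonetic match above, here is a minimal sketch of the plain Levenshtein (LD) distance for the same pair of names. It is not code from Lucene; the class name LevenshteinSketch and the choice to lowercase the inputs first are assumptions made for illustration only.

```java
public class LevenshteinSketch {

    // Classic dynamic-programming edit distance (insert, delete, substitute,
    // each costing 1), keeping only two rows of the table in memory.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting i characters of a to reach the empty string
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows so prev always holds the last completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Inputs are lowercased by hand here (an assumption of this sketch).
        System.out.println(distance("tchaikovsky", "chicovsky")); // prints 3
    }
}
```

The two spellings are 3 edits apart (drop "t", drop "a", replace "k" with "c"), which is consistent with the point in the quoted message that a distance limit of 2 would miss this case, while DoubleMetaphone maps both to the same code.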
