[EMAIL PROTECTED] wrote:
>> On Wed, 06 Feb 2008 17:32:53 -0600, Robert Kern wrote:
>>
>>> Jeff Schwab wrote:
>> ...
>>>> If the strings happen to be the same length, the Levenshtein distance
>>>> is equivalent to the Hamming distance.
>
> Is this really what the OP was asking for? If I understand it correctly,
> Levenshtein distance works out the number of edits required to transform
> the string to the target string. The smaller the more equivalent, but with
> the OP's problem I would expect
>
>     table1    table2
>     brian     briam
>               erian
>
> I think the OP would like to guess at 'briam' rather than 'erian', but
> Levenshtein would rate them equally good guesses?
>
> I know this is pushing it more toward phonetic analysis of the words or
> something similar, and that's orders of magnitude more complex.
>
> just in case,
>
> http://www.linguistlist.org/sp/Software.html#97
>
> might be a good place to start looking into it, along with the NLTK
> libraries here
>
> http://nltk.sourceforge.net/index.php/Documentation
>
You could perhaps use Soundex to try to choose between different
possibilities with the same Levenshtein distance from the sample.
Soundex by itself is horrible, but it might work as a prioritizer.
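A rough sketch of that idea, in case it helps: rank candidates by
Levenshtein distance first and use a Soundex match only to break ties.
The function names (levenshtein, soundex, rank_candidates) are made up
for illustration and the Soundex here is a plain hand-rolled American
Soundex, not taken from any particular library.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


# Letter groups used by American Soundex.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word: str) -> str:
    """Four-character Soundex code, e.g. 'Robert' -> 'R163'."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0].upper()
    last = _CODES.get(letters[0], "")
    for c in letters[1:]:
        code = _CODES.get(c, "")
        if code and code != last:
            out += code
        if c not in "hw":   # h and w do not separate equal codes; vowels do
            last = code
    return (out + "000")[:4]


def rank_candidates(sample, candidates):
    """Sort by edit distance; among equal distances, Soundex matches first."""
    target = soundex(sample)
    return sorted(candidates,
                  key=lambda c: (levenshtein(sample, c), soundex(c) != target))


print(rank_candidates("brian", ["erian", "briam"]))   # ['briam', 'erian']
```

Both candidates are one substitution away from 'brian', so the distance
alone cannot separate them; 'briam' shares the code B650 with 'brian'
while 'erian' comes out as E650, which is what pushes 'briam' to the
front. Because Soundex always keeps the first letter, this tie-breaker
mostly penalizes candidates whose first letter differs.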
regards
 Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC              http://www.holdenweb.com/