Re: Looking for library to estimate likeness of two strings

2008-02-07 Thread John Machin
On Feb 7, 10:37 pm, [EMAIL PROTECTED] wrote: > > On Wed, 06 Feb 2008 17:32:53 -0600, Robert Kern wrote: > > > > Jeff Schwab wrote: > > ... > > >> If the strings happen to be the same length, the Levenshtein distance > > >> is equivalent to the Hamming distance. > > Is this really what the OP was as

Re: Looking for library to estimate likeness of two strings

2008-02-07 Thread Gabriel Genellina
On Thu, 07 Feb 2008 13:25:14 -0200, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > Many thanks for the excellent leads. I've also found several > functions to find phonetic similarity between English names: the > mentioned above soundex, then, also, one called metaphone. I'm now > thinking

Re: Looking for library to estimate likeness of two strings

2008-02-07 Thread Guilherme Polo
2008/2/7, [EMAIL PROTECTED] <[EMAIL PROTECTED]>: > On Feb 7, 2:37 am, "Daniel Fetchinson" <[EMAIL PROTECTED]> > wrote: > > > Hi folks, just went through this thread and a related one from 2006 > > and I was wondering what the best solution is for using these string > > metrics in a database sear

Re: Looking for library to estimate likeness of two strings

2008-02-07 Thread [EMAIL PROTECTED]
On Feb 7, 2:37 am, "Daniel Fetchinson" <[EMAIL PROTECTED]> wrote: > Hi folks, just went through this thread and a related one from 2006 > and I was wondering what the best solution is for using these string > metrics in a database search. If I want to query the database for a > string or something

Re: Looking for library to estimate likeness of two strings

2008-02-07 Thread [EMAIL PROTECTED]
Hi, All: Many thanks for the excellent leads. I've also found several functions to find phonetic similarity between English names: the mentioned above soundex, then, also, one called metaphone. I'm now thinking of the best way to use some combination of these functions.
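For readers who have not met Soundex, here is a minimal sketch of the classic American Soundex coding, written from the commonly published rules. It is not the function the poster found; the names `CODES` and `soundex` are chosen here for illustration only.

```python
# Illustrative Soundex sketch (not taken from the thread).
# Consonant groups and the digit each one maps to.
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a 4-character Soundex code, or '' if name has no letters."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]  # pad with zeros, truncate to 4 characters


print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

Names that sound alike in English tend to get the same code, so a code match could be used as one signal alongside an edit-distance score.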

Re: Looking for library to estimate likeness of two strings

2008-02-07 Thread JKPeck
On Feb 7, 6:11 am, Lee Capps <[EMAIL PROTECTED]> wrote: > At 14:01 Wed 06 Feb 2008, [EMAIL PROTECTED] wrote: > > >Are there any Python libraries implementing measurement of similarity > >of two strings of Latin characters? > > >I'm writing a script to guess-merge two tables based on people's > >nam

Re: Looking for library to estimate likeness of two strings

2008-02-07 Thread Steve Holden
[EMAIL PROTECTED] wrote: >> On Wed, 06 Feb 2008 17:32:53 -0600, Robert Kern wrote: >>> Jeff Schwab wrote: >> ... >>>> If the strings happen to be the same length, the Levenshtein distance >>>> is equivalent to the Hamming distance. > Is this really what the OP was asking for

Re: Looking for library to estimate likeness of two strings

2008-02-07 Thread Lee Capps
At 14:01 Wed 06 Feb 2008, [EMAIL PROTECTED] wrote: >Are there any Python libraries implementing measurement of similarity >of two strings of Latin characters? > >I'm writing a script to guess-merge two tables based on people's >names, which are not necessarily spelled the same way in both tables >(

Re: Looking for library to estimate likeness of two strings

2008-02-07 Thread Matthew_WARREN
> On Wed, 06 Feb 2008 17:32:53 -0600, Robert Kern wrote: > > > Jeff Schwab wrote: > ... > >> If the strings happen to be the same length, the Levenshtein distance > >> is equivalent to the Hamming distance. Is this really what the OP was asking for? If I understand it correctly, Levenshtein

Re: Looking for library to estimate likeness of two strings

2008-02-06 Thread Daniel Fetchinson
> Are there any Python libraries implementing measurement of similarity > of two strings of Latin characters? > > I'm writing a script to guess-merge two tables based on people's > names, which are not necessarily spelled the same way in both tables > (especially the given names). I would like som

Re: Looking for library to estimate likeness of two strings

2008-02-06 Thread Jeff Schwab
Steven D'Aprano wrote: > On Wed, 06 Feb 2008 17:32:53 -0600, Robert Kern wrote: > >> Jeff Schwab wrote: > ... >>> If the strings happen to be the same length, the Levenshtein distance >>> is equivalent to the Hamming distance. > ... >> I'm afraid that it isn't. Using Magnus Lie Hetland's implement

Re: Looking for library to estimate likeness of two strings

2008-02-06 Thread Steven D'Aprano
On Wed, 06 Feb 2008 17:32:53 -0600, Robert Kern wrote: > Jeff Schwab wrote: ... >> If the strings happen to be the same length, the Levenshtein distance >> is equivalent to the Hamming distance. ... > I'm afraid that it isn't. Using Magnus Lie Hetland's implementation: ... > In [4]: hamdist('abcde
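The disagreement quoted above can be checked with a small sketch. The two functions below are plain textbook versions written for this illustration, not Magnus Lie Hetland's implementation, and the example strings are chosen here rather than taken from the thread.

```python
def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a, b):
    """Minimum number of single-character inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


# Same length, but shifted by one character:
print(hamming("flaw", "lawn"), levenshtein("flaw", "lawn"))  # 4 2
```

For equal-length strings the Hamming distance is only an upper bound on the Levenshtein distance, because an insert plus a delete can realign a shifted string more cheaply than substituting every position.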

Re: Looking for library to estimate likeness of two strings

2008-02-06 Thread Robert Kern
Jeff Schwab wrote: > Tim Chase wrote: >>> Are there any Python libraries implementing measurement of similarity >>> of two strings of Latin characters? >> It sounds like you're interested in calculating the Levenshtein distance: >> >> http://en.wikipedia.org/wiki/Levenshtein_distance >> >> which gi

Re: Looking for library to estimate likeness of two strings

2008-02-06 Thread Jeff Schwab
Tim Chase wrote: >> Are there any Python libraries implementing measurement of similarity >> of two strings of Latin characters? > > It sounds like you're interested in calculating the Levenshtein distance: > > http://en.wikipedia.org/wiki/Levenshtein_distance > > which gives you a measure of ho

Re: Looking for library to estimate likeness of two strings

2008-02-06 Thread Tim Chase
> Are there any Python libraries implementing measurement of similarity > of two strings of Latin characters? It sounds like you're interested in calculating the Levenshtein distance: http://en.wikipedia.org/wiki/Levenshtein_distance which gives you a measure of how different they are. A measu

Looking for library to estimate likeness of two strings

2008-02-06 Thread [EMAIL PROTECTED]
Are there any Python libraries implementing measurement of similarity of two strings of Latin characters? I'm writing a script to guess-merge two tables based on people's names, which are not necessarily spelled the same way in both tables (especially the given names). I would like some function
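As one possible starting point for the kind of scoring asked about here, the standard library's `difflib.SequenceMatcher` returns a similarity ratio between 0 and 1. The sketch below is an assumption about how it might be applied to name matching; the helper names `similarity` and `best_match`, the sample names, and the 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(name, candidates, threshold=0.8):
    """Return the most similar candidate, or None if nothing reaches threshold.

    The threshold is an arbitrary illustrative value and would need tuning
    against real data.
    """
    best, best_score = None, 0.0
    for candidate in candidates:
        score = similarity(name, candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


print(best_match("Jon Smith", ["Jane Doe", "John Smith"]))  # John Smith
```

Returning `None` below the threshold leaves doubtful pairs unmerged, which is usually safer than forcing a match when merging tables by name.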