[EMAIL PROTECTED] wrote: > This algorithm is called soundex. Here is one implementation example. > > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52213 > > here is another: > http://effbot.org/librarybook/soundex.htm
Soundex is *one* particular algorithm for approximate string matching. It is optimised for matching Anglo-American names (like Smith/Smythe), and is considered to be quite old and obsolete for all but the most trivial applications -- or so I'm told. Soundex will not match arbitrary changes -- it will match both cat and cet, but it won't match cat and mat. A more sophisticated approximate string matching algorithm will use the Levenshtein distance. You can find a Useless implementation here: http://www.uselesspython.com/download.php?script_id=108 Given a function levenshtein(s1, s2) that returns the distance between two strings, you could use it for approximate matching like this: def approx_matching(strlist, target, dist=1): """Matches approximately strings in strlist to a target string. Returns a list of strings, where each string matched is no further than an edit distance of dist from the target. """ found = [] for s in strlist: if levenshtein(s, target) <= dist: found.append(s) return s -- Steven. -- http://mail.python.org/mailman/listinfo/python-list