Gabriel Genellina wrote:
> At Tuesday 26/12/2006 18:08, John Machin wrote:
> > > > I'm looking for a module to do fuzzy comparison of strings. [...]
>
> Other alternatives: trigram, n-gram, Jaro's distance. There are some
> Python implem. available.

Quick question: you mentioned that the data you need to compare is stored in a database. Is this a one-time pass to clean up the data, or will you need to keep running fuzzy string comparisons against the database?

There are some papers out there on implementing n-gram string comparison entirely in SQL, so that you don't have to pull all the data out of your tables in order to do the fuzzy comparisons. I can dig up some code I wrote a while ago and post it (in Java).
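
For readers unfamiliar with the n-gram approach mentioned above, here is a minimal sketch in Python. It is not the Java code offered in this post, nor the method of any particular paper. The function names `ngrams` and `ngram_similarity`, the space padding, the lowercasing, and the choice of the Dice coefficient as the score are all assumptions made for illustration.

```python
def ngrams(text, n=3):
    """Return the set of character n-grams of text (illustrative helper).

    The string is lowercased and padded with n-1 spaces on each side so
    that characters at the start and end contribute as many n-grams as
    those in the middle. Both choices are assumptions, not a standard.
    """
    padded = " " * (n - 1) + text.lower() + " " * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Dice coefficient of the two n-gram sets.

    Returns 0.0 when the strings share no n-grams and 1.0 when their
    n-gram sets are identical.
    """
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a and not grams_b:
        # Only reachable for n=1 with two empty strings.
        return 1.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


if __name__ == "__main__":
    # A misspelling should score well above an unrelated name.
    print(ngram_similarity("Genellina", "Genelina"))
    print(ngram_similarity("Genellina", "Machin"))
```

An in-database version of the same idea would presumably store each string's n-grams as rows in a side table and count shared rows with a join, which is what lets the comparison run without fetching the full tables. That table layout is an assumption here, not something taken from the papers referred to.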