On Apr 7, 4:22 pm, Jesse Aldridge <[EMAIL PROTECTED]> wrote:
>
> > changing "( " to "(" and " )" to ")".
>
> Changed.

But then you introduced more.

>
> I attempted to take out everything that could be trivially implemented
> with the standard library.
> This has left me with... 4 functions in S.py. 1 of them is used
> internally, and the others aren't terribly awesome :\ But I think the
> ones that remain are at least a bit useful :)

If you want to look at stuff that can't be implemented trivially using
str/unicode methods, and is more than a bit useful, google for
mxTextTools.

>
> > A basic string normalisation-before-comparison function would
> > usefully include replacing multiple internal whitespace characters by
> > a single space.
>
> I added this functionality.

Not quite. I said "whitespace", not "space". The following is the
standard Python idiom for removing leading and trailing whitespace and
replacing each run of one or more whitespace characters with a single
space:

def normalise_whitespace(s):
    return ' '.join(s.split())

If your data is obtained by web scraping, you may find some people use
'\xA0' aka NBSP to pad out fields. The above code will get rid of
these if s is unicode; if s is str, you need to chuck a
.replace('\xA0', ' ') in there somewhere.

HTH,
John
--
http://mail.python.org/mailman/listinfo/python-list
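
P.S. A rough sketch of the NBSP point, for anyone who wants to try it. It assumes Python 3, where the unicode/str distinction mentioned above becomes str/bytes: str.split() treats NBSP as whitespace, bytes.split() only knows ASCII whitespace. The helper normalise_whitespace_bytes is a made-up name for illustration, and it assumes the bytes are in a Latin-1-style encoding where NBSP is the single byte 0xA0.

```python
def normalise_whitespace(s):
    """Strip leading/trailing whitespace and collapse internal runs to one space."""
    return ' '.join(s.split())


def normalise_whitespace_bytes(b):
    """Hypothetical helper for byte strings (not from the original post).

    bytes.split() only recognises ASCII whitespace, so the NBSP byte 0xA0
    (assumption: Latin-1-style encoding) is replaced by a space first.
    """
    return b' '.join(b.replace(b'\xa0', b' ').split())


# Text string: NBSP counts as whitespace, so the idiom handles it directly.
print(repr(normalise_whitespace('  foo \t\n bar\xa0\xa0baz  ')))        # 'foo bar baz'

# Byte string: without the replace() the NBSP byte survives untouched.
print(repr(b' '.join(b'bar\xa0baz'.split())))                           # b'bar\xa0baz'
print(repr(normalise_whitespace_bytes(b'  foo \t\n bar\xa0\xa0baz  ')))  # b'foo bar baz'
```

If the bytes are UTF-8 rather than Latin-1, NBSP is two bytes, so decoding to text first and using the plain idiom is probably the safer route.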