On 2013-10-26 21:54, Roy Smith wrote:
> In article <mailman.1628.1382838024.18130.python-l...@python.org>,
> Tim Chase <python.l...@tim.thechases.com> wrote:
>> I'd be just as happy if Python provided a "sloppy string compare"
>> that ignored case, diacritical marks, and the like.
>
> The problem with putting fuzzy matching in the core language is
> that there is no general agreement on how it's supposed to work.
>
> There are, however, third-party libraries which do fuzzy matching.
> One popular one is jellyfish
> (https://pypi.python.org/pypi/jellyfish/0.1.2).

Bookmarking and archiving your email for future reference.

> Don't expect you can just download and use it right out of the box,
> however. You'll need to do a little thinking about which of the
> several algorithms it includes makes sense for your application.

I'd be content with a baseline that decomposes and then strips out
combining diacritical marks, something akin to MRAB's

  from unicodedata import normalize
  "".join(c for c in normalize("NFKD", s) if ord(c) < 0x80)

and tweaking it if that was insufficient.

Thanks for the link to Jellyfish.

-tkc
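
A minimal sketch of the "sloppy string compare" baseline discussed above,
assuming Python 3.3+ for str.casefold(). The names sloppy_key and
sloppy_equal are hypothetical, not from any library. Unlike the
ord(c) < 0x80 filter, this variant drops only combining marks, so
non-ASCII letters that have no decomposition (for example "ø") are kept
rather than silently removed. Whether that is the right trade-off
depends on the application.

```python
import unicodedata


def sloppy_key(s):
    """Hypothetical helper: decompose, drop combining marks, fold case."""
    # NFKD splits accented characters into a base character plus
    # combining marks (and also applies compatibility mappings).
    decomposed = unicodedata.normalize("NFKD", s)
    # unicodedata.combining() returns a nonzero class for combining marks.
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # casefold() is a more aggressive lower() meant for caseless matching.
    return stripped.casefold()


def sloppy_equal(a, b):
    """Hypothetical helper: compare two strings by their sloppy keys."""
    return sloppy_key(a) == sloppy_key(b)
```

With these definitions, sloppy_equal("Café", "CAFE") and
sloppy_equal("naïve", "Naive") should both be true, while
sloppy_equal("abc", "abd") is false.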