On Thursday 21 January 2010, MRAB wrote:
> For longest first you need:
>
> keys = sorted(mapping.keys(), key=len, reverse=True)

Oh yes, I cut and pasted the wrong line :-)
Just for clarity:

    import re

    mapping = {
        "foo": "bar",
        "baz": "quux",
        "quuux": "foo",
    }

    # Sort the keys, longest first, so 'aa' gets matched before 'a', because
    # in Python regexps the first match (going from left to right) in a
    # |-separated group is taken.
    keys = sorted(mapping.keys(), key=len, reverse=True)
    rx = re.compile("|".join(keys))
    repl = lambda x: mapping[x.group()]
    s = "fooxxxbazyyyquuux"
    rx.sub(repl, s)

>> One thing remaining: if the replacement keys could contain non-alphanumeric
>> characters, they should be escaped using re.escape:
>> rx = re.compile("|".join(re.escape(key) for key in keys))
>>
> Strictly speaking, not all non-alphanumeric characters, but only the
> special ones.

True, although the re.escape function simply escapes all non-alphanumeric
characters :)

And here is a factory function that returns a translator given a mapping.
The translator can be called to perform replacements in a string:

    import re

    def translator(mapping):
        # Keys are joined unescaped here, so they must not contain
        # regex special characters.
        keys = sorted(mapping.keys(), key=len, reverse=True)
        rx = re.compile("|".join(keys))
        repl = lambda m: mapping[m.group()]
        return lambda s: rx.sub(repl, s)

    # Usage:
    >>> t = translator(mapping)
    >>> t('fooxxxbazyyyquuux')
    'barxxxquuxyyyfoo'

With best regards,
Wilbert Berendsen

--
http://www.wilbertberendsen.nl/
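
P.S. Putting the two points from this thread together (longest keys first, plus re.escape), a variant of the factory that tolerates keys containing regex special characters might look roughly like the sketch below. The name escaping_translator and the sample mapping are made up for illustration and do not come from the thread.

```python
import re

def escaping_translator(mapping):
    # Hypothetical variant of translator(); the name is illustrative only.
    # Longest keys first, so "a.b" is tried before "a" at the same position.
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape makes characters such as "." and "+" match literally.
    rx = re.compile("|".join(re.escape(key) for key in keys))
    return lambda s: rx.sub(lambda m: mapping[m.group()], s)

t = escaping_translator({"a": "1", "a.b": "2", "c++": "3"})
print(t("a.b axb c++ a"))  # prints: 2 1xb 3 1
```

Without the escaping step, the "." in "a.b" would match any character, so "axb" would be matched as a whole and the lookup in the mapping would fail with a KeyError.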