On Monday 18 January 2010, Adi wrote:
> keys = [(len(key), key) for key in mapping.keys()]
> keys.sort(reverse=True)
> keys = [key for (_, key) in keys]
>
> pattern = "(%s)" % "|".join(keys)
> repl = lambda x : mapping[x.group(1)]
> s = "fooxxxbazyyyquuux"
>
> re.subn(pattern, repl, s)

I managed to make it even shorter by using the key argument of sorted, not
putting the whole regexp inside parentheses, and pre-compiling the regular
expression:

    import re

    mapping = {
        "foo": "bar",
        "baz": "quux",
        "quuux": "foo",
    }

    # Sort the keys, longest first, so 'aa' gets matched before 'a', because
    # in Python regexps the first match (going from left to right) in a
    # |-separated group is taken.
    keys = sorted(mapping.keys(), key=len, reverse=True)
    rx = re.compile("|".join(keys))
    repl = lambda x: mapping[x.group()]
    s = "fooxxxbazyyyquuux"

    rx.sub(repl, s)

Note the reverse=True: sorted with key=len alone puts the shortest keys
first, which is the opposite of what the comment asks for.

One thing remaining: if the replacement keys could contain non-alphanumeric
characters, they should be escaped using re.escape:

    rx = re.compile("|".join(re.escape(key) for key in keys))

Kind regards,
Wilbert Berendsen

-- 
http://www.wilbertberendsen.nl/
"You must be the change you wish to see in the world."
        -- Mahatma Gandhi
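
For reference, the pieces above (longest-first ordering, a pre-compiled pattern, and re.escape) can be combined into one helper. This is only a sketch: the function name multi_replace, the guard for an empty mapping, and the sample mapping are made up for illustration and do not come from the thread.

```python
import re

def multi_replace(text, mapping):
    """Replace every key of `mapping` found in `text` with its value, in one pass."""
    if not mapping:
        # An empty alternation would match the empty string everywhere.
        return text
    # Longest keys first: in an alternation, Python's re takes the first
    # alternative that matches, so "aa" must be tried before "a".
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape makes keys containing regex metacharacters match literally.
    rx = re.compile("|".join(re.escape(key) for key in keys))
    return rx.sub(lambda m: mapping[m.group()], text)

# Hypothetical sample data: "a" is a prefix of "aa", and "a.b" contains a dot.
sample = {"a": "1", "aa": "2", "a.b": "3"}
print(multi_replace("aa a.b axb", sample))
```

Without the longest-first ordering, "aa" would be replaced as two separate "a" matches; without re.escape, the dot in "a.b" would also match "axb".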