On Jan 21, 2:18 pm, Wilbert Berendsen <wbs...@xs4all.nl> wrote:
> On Monday 18 January 2010, Adi wrote:
>
> > keys = [(len(key), key) for key in mapping.keys()]
> > keys.sort(reverse=True)
> > keys = [key for (_, key) in keys]
>
> > pattern = "(%s)" % "|".join(keys)
> > repl = lambda x : mapping[x.group(1)]
> > s = "fooxxxbazyyyquuux"
>
> > re.subn(pattern, repl, s)
>
> I managed to make it even shorter, using the key argument for sorted, not
> putting the whole regexp inside parentheses and pre-compiling the regular
> expression:
>
> import re
>
> mapping = {
>     "foo" : "bar",
>     "baz" : "quux",
>     "quuux" : "foo"
> }
>
> # sort the keys, longest first, so 'aa' gets matched before 'a', because
> # in Python regexps the first match (going from left to right) in a
> # |-separated group is taken
> keys = sorted(mapping.keys(), key=len, reverse=True)
>
> rx = re.compile("|".join(keys))
> repl = lambda x: mapping[x.group()]
> s = "fooxxxbazyyyquuux"
> rx.sub(repl, s)
>
> One thing remaining: if the replacement keys could contain non-alphanumeric
> characters, they should be escaped using re.escape:
>
> rx = re.compile("|".join(re.escape(key) for key in keys))
>
> Kind regards,
> Wilbert Berendsen
>
> -- http://www.wilbertberendsen.nl/
> "You must be the change you wish to see in the world."
> -- Mahatma Gandhi

Sorting isn't the right solution: it is easier to hold the substitutions as
a sequence of tuple pairs and, by doing so, let the user specify the order.
Consider the following substitutions:

"fooxx" -> "baz"
"oxxx" -> "bar"

Does the user want "bazxbazyyyquuux" or "fobarbazyyyquuux"?

Iain
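
To make the ordering point concrete, here is a minimal sketch of the tuple-pair idea. It assumes the pairs are applied one after another with str.replace, which is only one possible reading of the suggestion; the helper name replace_in_order is made up for illustration and does not come from the thread.

```python
def replace_in_order(text, subs):
    """Apply each (old, new) pair in turn, in the order the caller supplies.

    Hypothetical helper: the thread only proposes holding the substitutions
    as tuple pairs; applying them sequentially is an assumption.
    """
    for old, new in subs:
        text = text.replace(old, new)
    return text


s = "fooxxxbazyyyquuux"

# Same two substitutions, two different user-specified orders.
first = replace_in_order(s, [("fooxx", "baz"), ("oxxx", "bar")])
second = replace_in_order(s, [("oxxx", "bar"), ("fooxx", "baz")])

print(first)   # bazxbazyyyquuux
print(second)  # fobarbazyyyquuux
```

Note that this makes several passes over the string, so a later pair can rewrite text produced by an earlier one, which the single-pass regexp approach avoids. With the "|"-joined regexp, on the other hand, the order of the alternatives only decides between keys that match at the same starting position; the leftmost match in the string still wins, so "fooxx" would be replaced whichever order the keys are listed in.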