[issue33647] Add re.replace(string, replacement_map)

Serhiy Storchaka Sat, 26 May 2018 00:42:07 -0700

Serhiy Storchaka <storchaka+cpyt...@gmail.com> added the comment:

I'm -1 on adding support for this to str.replace. It would need very non-trivial 
code, and unicodeobject.c is already one of the largest and most complex files. 
Adding new complex code will make maintenance harder and can make the compiler 
produce less optimal code for other methods. str.replace is already well 
optimized; it is often better to call it several times than to use other methods 
(regular expressions or str.translate).
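
A rough way to check the claim above on your own data is sketched below. The 
sample text, the mapping and the function names are made up for illustration, 
and no particular timings are implied; the result depends on the input and the 
Python build.

```python
import re
import timeit

text = 'hello world ' * 1000
# Single-character keys only, so that all three approaches are applicable.
mapping = {'l': 'L', 'o': '0'}

def with_replace(s):
    # One str.replace() call per key.
    for key, value in mapping.items():
        s = s.replace(key, value)
    return s

table = str.maketrans(mapping)

def with_translate(s):
    return s.translate(table)

pattern = re.compile('|'.join(map(re.escape, mapping)))

def with_regex(s):
    return pattern.sub(lambda m: mapping[m[0]], s)

# Seconds per 100 calls for each approach; compare the numbers locally.
timings = {
    f.__name__: timeit.timeit(lambda: f(text), number=100)
    for f in (with_replace, with_translate, with_regex)
}
```

Print timings to compare the three approaches on your machine.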


You should be careful when applying str.replace() sequentially if some keys are 
prefixes of other keys ({'a': 'x', 'ab': 'y'}). The replacements must be 
performed in the correct order, longer keys first. But even that doesn't work 
in cases like {'a': 'b', 'b': 'a'}, where a later replacement rewrites the 
output of an earlier one.
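
Both pitfalls can be reproduced with a small helper. This is only an 
illustration; sequential_replace and its keys argument are hypothetical names, 
not an existing API.

```python
def sequential_replace(string, mapping, keys):
    """Apply str.replace() once per key, in the given order."""
    for key in keys:
        string = string.replace(key, mapping[key])
    return string

prefix_map = {'a': 'x', 'ab': 'y'}
# Shorter key first: 'a' is consumed before 'ab' gets a chance to match.
short_first = sequential_replace('ab', prefix_map, ['a', 'ab'])  # 'xb'
# Longer key first: 'ab' is replaced as a whole.
long_first = sequential_replace('ab', prefix_map, ['ab', 'a'])   # 'y'

swap_map = {'a': 'b', 'b': 'a'}
# No order works for a swap: the second pass rewrites the first pass's output.
swap_ab = sequential_replace('ab', swap_map, ['a', 'b'])  # 'aa'
swap_ba = sequential_replace('ab', swap_map, ['b', 'a'])  # 'bb'
```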

The regular expression based implementation has to be more complex than Terry's 
example:

import re

def re_replace(string, mapping):
    def repl(m):
        return mapping[m[0]]
    # Reverse sorting puts a longer key before any key that is its prefix.
    pattern = '|'.join(map(re.escape, sorted(mapping, reverse=True)))
    return re.sub(pattern, repl, string)

And it will be very inefficient, because creating and compiling a pattern is 
much slower than performing the replacement itself, and it can't be cached. 
This function would not be very useful for practical purposes. You will need to 
split it into two parts. First prepare a compiled pattern:

    def repl(m):
        return mapping[m[0]]
    compiled_pattern = re.compile(
        '|'.join(map(re.escape, sorted(mapping, reverse=True))))

And later use it:

    newstring = compiled_pattern.sub(repl, string)
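
One possible way to package the two parts is a factory that compiles the 
pattern once and returns a function to call later. This is a minimal sketch; 
make_replacer is a hypothetical name, and a non-empty mapping is assumed (an 
empty one would produce a pattern that matches the empty string).

```python
import re

def make_replacer(mapping):
    """Compile the pattern once; return a function that applies it."""
    # Assumes a non-empty mapping. Reverse sorting puts a longer key
    # before any key that is its prefix.
    compiled_pattern = re.compile(
        '|'.join(map(re.escape, sorted(mapping, reverse=True))))

    def repl(m):
        return mapping[m[0]]

    def replace(string):
        return compiled_pattern.sub(repl, string)

    return replace

swap = make_replacer({'a': 'b', 'b': 'a'})
swapped = swap('abba')                                    # 'baab'
prefixed = make_replacer({'a': 'x', 'ab': 'y'})('aab')    # 'xy'
```

All replacements happen in a single pass over the string, so both the prefix 
case and the swap case behave as expected.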

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33647>
_______________________________________