On Mon, 11 Jan 2010 13:51:48 -0800, Chris Rebert wrote:

> On Mon, Jan 11, 2010 at 12:34 PM, Steven D'Aprano
> <st...@remove-this-cybersource.com.au> wrote:
> <snip>
>> If you can avoid regexes in favour of ordinary string methods, do so.
>> In general, something like:
>>
>>     source.replace(target, new)
>>
>> will potentially be much faster than:
>>
>>     regex = re.compile(target)
>>     regex.sub(new, source)
>>     # equivalent to re.sub(target, new, source)
>>
>> (assuming of course that target is just a plain string with no regex
>> specialness). If you're just cracking a peanut, you probably don't need
>> the 30 lb sledgehammer of regular expressions.
>
> Of course, but is the regex library really not smart enough to
> special-case and optimize vanilla string substitutions?
Apparently not in Python 2.5:

>>> from timeit import Timer
>>> t1 = Timer('x.sub("Dutch", "Nobody expects the Spanish Inquisition!")',
...     'from re import compile; x = compile("Spanish")')
>>> t2 = Timer('x.replace("Spanish", "Dutch")',
...     'x="Nobody expects the Spanish Inquisition!"')
>>>
>>> t1.repeat()
[3.7209370136260986, 2.7262279987335205, 2.6416280269622803]
>>> t2.repeat()
[2.2915709018707275, 1.2584249973297119, 1.2730350494384766]

Even if it did, I wouldn't rely on that sort of special casing unless the
language guaranteed it. Keep in mind that regexes are essentially a
programming language (although not Turing Complete), and the engine
implementation may choose purity and simplicity over such optimizations.

-- 
Steven
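For anyone who wants to repeat the comparison on their own interpreter, here is a minimal sketch of the same measurement as a standalone script. The helper names (replace_with_str, replace_with_regex, best_time) are made up for illustration and are not from the thread, and the regex is compiled once up front as in the session above. The regex variant goes through re.escape() and a function replacement so that both the target and the new text are treated as plain strings, which is the "no regex specialness" assumption from the quoted advice. Timings will vary by machine and Python version, so none are claimed here.

```python
import re
from timeit import Timer

SOURCE = "Nobody expects the Spanish Inquisition!"
TARGET = "Spanish"
NEW = "Dutch"

# Compile once, outside the timed call, as in the original session.
# re.escape() makes any regex metacharacters in TARGET literal.
COMPILED = re.compile(re.escape(TARGET))


def replace_with_str(source=SOURCE, target=TARGET, new=NEW):
    """Plain substring substitution using the str method."""
    return source.replace(target, new)


def replace_with_regex(source=SOURCE, target=TARGET, new=NEW):
    """The same substitution routed through the regex engine.

    A function replacement is used so that backslashes in `new`
    are inserted literally rather than processed as escapes.
    """
    if target == TARGET:
        pattern = COMPILED
    else:
        pattern = re.compile(re.escape(target))
    return pattern.sub(lambda match: new, source)


def best_time(func, number=100000, repeat=3):
    """Return the fastest of `repeat` runs, each calling func `number` times."""
    return min(Timer(func).repeat(repeat=repeat, number=number))


if __name__ == "__main__":
    # Both approaches must agree before the timings mean anything.
    assert replace_with_str() == replace_with_regex()
    print("str.replace:", best_time(replace_with_str))
    print("regex.sub:  ", best_time(replace_with_regex))
```

Taking the minimum of the repeats is the usual way to read timeit numbers, since the slower runs mostly reflect other activity on the machine. Note that the function replacement adds call overhead of its own, so this sketch is somewhat less favourable to the regex than the session above.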