On Mon, 11 Jan 2010 11:20:34 -0800, Jeremy wrote:

> I just profiled one of my Python scripts

Well done! I'm not being sarcastic, or condescending, but you'd be AMAZED (or possibly not...) at how many people try to optimize their scripts *without* profiling, and end up trying to speed up parts of the code that don't matter while ignoring the actual bottlenecks.

> and discovered that >99% of the time was spent in
>
> {built-in method sub}
>
> What is this function

You don't give us enough information to answer with anything more than a guess. You know what is in your scripts, we don't. I can do this:

>>> sub
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'sub' is not defined

So it's not a built-in function. Nor do strings have a sub method. So I'm reduced to guessing. Based on your previous post, you're probably using regexes, so:

>>> import re
>>> type(re.sub)
<type 'function'>

Getting closer, but that's a function, not a method.

>>> type(re.compile("x").sub)
<type 'builtin_function_or_method'>

That's probably the best candidate: you're probably calling the sub method on a pre-compiled regular expression object.

As for the second part of your question:

> and is there a way to optimize it?

I think you'll find that Python's regex engine is pretty much optimised as well as it can be, short of a major re-write. But to quote Jamie Zawinski:

    Some people, when confronted with a problem, think "I know,
    I'll use regular expressions." Now they have two problems.

The best way to optimize regexes is to use them only when necessary. They are inherently expensive: a regex is a mini programming language of its own. Naturally some regexes are more expensive than others: some can be *really* expensive, some are not. If you can avoid regexes in favour of ordinary string methods, do so.
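
For what it's worth, here is a minimal sketch of how you might check that on your own data rather than take my word for it. The sample text and the names with_replace and with_regex are made up for illustration; only str.replace, re.compile, re.escape and timeit.repeat are real library calls. It first confirms that both approaches give the same result, then (when run as a script) prints rough timings, which will vary with your machine and your input.

```python
import re
import timeit

# Hypothetical sample data; substitute your real input here.
source = "spam and eggs, " * 10000
target = "eggs"   # a plain literal with no regex metacharacters
new = "ham"

# re.escape guards against target accidentally containing special characters.
regex = re.compile(re.escape(target))


def with_replace():
    """Plain string method."""
    return source.replace(target, new)


def with_regex():
    """Pre-compiled regular expression doing the same job."""
    return regex.sub(new, source)


# The two must agree before their timings are worth comparing.
assert with_replace() == with_regex()

if __name__ == "__main__":
    for func in (with_replace, with_regex):
        # Take the best of three runs to reduce noise from other processes.
        seconds = min(timeit.repeat(func, number=100, repeat=3))
        print("%-12s %.4f s per 100 calls" % (func.__name__, seconds))
```

If the two timings come out close on your data, the regex probably isn't your real problem and the pattern itself deserves a look.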

In general, something like:

    source.replace(target, new)

will potentially be much faster than:

    regex = re.compile(target)
    regex.sub(new, source)
    # equivalent to re.sub(target, new, source)

(assuming of course that target is just a plain string with no regex specialness). If you're just cracking a peanut, you probably don't need the 30 lb sledgehammer of regular expressions.

Otherwise, we'd need to see the actual regexes that you are using in order to comment on how you might optimize them.

-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list