On 07/19/2013 05:44 PM, Devyn Collier Johnson wrote:
On 07/19/2013 12:22 PM, Steven D'Aprano wrote:
On Fri, 19 Jul 2013 09:22:48 -0400, Devyn Collier Johnson wrote:
I have some code that I want to simplify. I know that a for-loop would
work well, but can I make re.sub perform all of the below tasks at once,
or can I write this in a way that is more efficient than using a
for-loop?
DATA = re.sub(',', '', 'DATA')
DATA = re.sub('\'', '', 'DATA')
DATA = re.sub('(', '', 'DATA')
DATA = re.sub(')', '', 'DATA')
I don't think you intended to put DATA in quotes on the right hand side.
That makes it literally the string D A T A, so the replacements that run are
no-ops (the bare '(' and ')' patterns would actually raise re.error), and you
could simplify it to:
DATA = 'DATA'
But that's probably not what you wanted.
My prediction is that this will be by far the most efficient way to do
what you are trying to do:
py> DATA = "Hello, 'World'()"
py> DATA.translate(dict.fromkeys(ord(c) for c in ",'()"))
'Hello World'
That's in Python 3 -- in Python 2, using translate will still probably be
the fastest, but you'll need to call it like this:
import string
DATA.translate(string.maketrans("", ""), ",'()")
I also expect that the string replace() method will be second fastest,
and re.sub will be the slowest, by a very long way.
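That ranking is a prediction rather than a measurement, so here is a rough sketch of how one might check it with timeit on Python 3. The helper names (with_translate, with_replace, with_re_sub), the input size, and the repeat count are all made up for illustration, and the timings will vary with the Python version and the input.

```python
import re
import timeit

# Hypothetical test input: the sample string repeated to make timings visible.
DATA = "Hello, 'World'()" * 1000
UNWANTED = ",'()"

# Deletion table for str.translate: code point -> None means "delete".
TABLE = dict.fromkeys(ord(c) for c in UNWANTED)

# Pre-compiled character class matching any one of the unwanted characters.
PATTERN = re.compile("[,'()]")


def with_translate(s):
    return s.translate(TABLE)


def with_replace(s):
    # One replace() pass per unwanted character.
    for c in UNWANTED:
        s = s.replace(c, "")
    return s


def with_re_sub(s):
    return PATTERN.sub("", s)


if __name__ == "__main__":
    for func in (with_translate, with_replace, with_re_sub):
        elapsed = timeit.timeit(lambda: func(DATA), number=200)
        print(f"{func.__name__}: {elapsed:.4f} s")
```

All three functions return the same string, so whichever one measures fastest on your interpreter can be dropped in for the others.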
As a general rule, you should avoid using regexes unless the text you
are searching for actually contains a regular expression of some kind. If
it's merely a literal character or substring, standard string methods
will probably be faster.
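A small illustration of why literal text and regexes don't mix well, assuming Python 3: a bare "(" is a regex metacharacter, so it has to be escaped before re.sub will accept it, while str.replace() takes it as plain text. The variable names are only for this sketch.

```python
import re

DATA = "Hello, 'World'()"

# A bare "(" opens a group that never closes, so re rejects the pattern.
try:
    re.sub("(", "", DATA)
except re.error as exc:
    print("re.sub rejected the pattern:", exc)

# re.escape() turns the literal character into a valid pattern...
via_regex = re.sub(re.escape("("), "", DATA)

# ...whereas str.replace() treats its argument as plain text already.
via_replace = DATA.replace("(", "")

assert via_regex == via_replace
```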
Oh, and a tip for you:
- don't escape quotes unless you need to; use the other quote instead.
s = '\'' # No, don't do this!
s = "'" # Better!
and vice versa.
Thanks for finding that error; DATA should not be in quotes. I cannot
believe I missed that. Good eye Steven!
Using the replace command is a brilliant idea; I will implement that
wherever I can. I want to perform all of the replaces at once.
Is that possible?
Read what you're quoting from. The translate() method does just that.
And maketrans() is the way to build a translate table.
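For what it's worth, a minimal sketch of the one-pass version, assuming Python 3, where maketrans() is a static method on str and its third argument lists the characters to delete. The names table and cleaned are just for illustration.

```python
DATA = "Hello, 'World'()"

# Third argument to str.maketrans: every character listed maps to None,
# which tells translate() to delete it.
table = str.maketrans("", "", ",'()")

# One call removes all four characters in a single pass over the string.
cleaned = DATA.translate(table)
print(cleaned)  # Hello World
```

The table only needs to be built once and can be reused for every string that needs the same cleanup.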
On an Intel processor, the xlat instruction translates a single byte through
a lookup table; it is not a string instruction, so a REP prefix will not
repeat it, and handling an entire buffer takes a loop around it.
No idea if Python takes advantage of that, however.
--
DaveA
--
http://mail.python.org/mailman/listinfo/python-list