Re: Regular expression fun. Repeated matching of a group Q

johnzenger Fri, 24 Feb 2006 08:42:13 -0800

You can check len(sanesplit) to see how big your list is.  If it is
less than 2, then the line contained no <td>'s, so move on to the next
line.
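
Here is a minimal sketch of that check.  It assumes sanesplit comes from
splitting each line on the literal string "<td>", which is a guess about
the earlier code; the helper name cells_after_td and the sample lines are
made up for illustration.

```python
def cells_after_td(line):
    """Return the pieces that follow each "<td>", or [] if there is none."""
    # Assumption: sanesplit is built by splitting on the literal "<td>".
    sanesplit = line.split("<td>")
    if len(sanesplit) < 2:
        # split() returned the line unchanged, so it held no "<td>".
        return []
    # Element 0 is whatever came before the first "<td>"; skip it.
    return sanesplit[1:]


# Hypothetical sample input, not taken from the original thread.
lines = ["<tr><td>alpha</td><td>beta</td></tr>", "<table>", ""]
results = [cells_after_td(line) for line in lines]
```

A line with no <td> always splits into a one-element list, which is why
the test is "less than 2" rather than "equal to 0".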


It is probably possible to do the whole thing with a regular
expression.  It is probably not wise to do so.  Regular expressions are
difficult to read, and, as you discovered, difficult to program and
debug.  In many cases, Python code that relies on regular expressions
for lots of program logic runs slower than code that uses normal
Python.

Suppose "words" contains all the words in English.  Compare these two
lines:

import re

foobarwords1 = [x for x in words if re.search("foo|bar", x)]
foobarwords2 = [x for x in words if "foo" in x or "bar" in x]

I haven't tested this with 2.4, but as of a few years ago it was a safe
bet that foobarwords2 would be calculated much, much faster.  Also, I
think you will agree, foobarwords2 is a lot easier to read.
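
If you want to measure it yourself rather than take my word for it,
something like the following should do.  The word list here is a small
made-up stand-in for "all the words in English", and the function names
and repeat count are arbitrary, so treat the numbers it prints as a rough
indication for your own machine and Python version only.

```python
import re
import timeit

# Hypothetical stand-in for a real English word list.
words = ["food", "barn", "rebar", "python", "list", "football", "spam"] * 1000


def with_regex():
    # Regular expression version.
    return [x for x in words if re.search("foo|bar", x)]


def with_in():
    # Plain substring tests.
    return [x for x in words if "foo" in x or "bar" in x]


foobarwords1 = with_regex()
foobarwords2 = with_in()

# Time 20 runs of each approach and report the totals in seconds.
regex_seconds = timeit.timeit(with_regex, number=20)
in_seconds = timeit.timeit(with_in, number=20)
print("re.search: %.4f s" % regex_seconds)
print("in:        %.4f s" % in_seconds)
```

Both versions select the same words, so whatever difference shows up is
purely in running time.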

