Re: Different number of matches from re.findall and re.split

MRAB Mon, 11 Jan 2010 08:11:20 -0800

Jeremy wrote:

On Jan 11, 8:44 am, Iain King <[email protected]> wrote:

On Jan 11, 3:35 pm, Jeremy <[email protected]> wrote:

Hello all,
I am using re.split to separate some text into logical structures.
The trouble is that re.split doesn't find everything while re.findall
does; i.e.:

    found = re.findall('^ 1', line, re.MULTILINE)
    len(found)

    tables = re.split('^ 1', line, re.MULTILINE)
    len(tables)
    1

Can someone explain why these two commands are giving different
results?  I thought I should have the same number of matches (or maybe
different by 1, but not 6000!)
Thanks,
Jeremy

re.split doesn't take re.MULTILINE as a flag: it doesn't take any
flags. Its third positional parameter is maxsplit, so you are passing
the value of re.MULTILINE (which happens to be 8 in my implementation)
as maxsplit. With no flag set, ^ matches only at the very start of the
string, not at every line start, so your re.split finds at most one
match (and none at all if the text doesn't begin with ' 1', which
would explain a result of length 1).
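
To make the point above concrete, here is a minimal sketch. The sample
text is made up for illustration (it is not Jeremy's data), and
maxsplit is spelled out by keyword to show what the positional call in
the question amounts to:

```python
import re

# Hypothetical sample text: three lines that each start with " 1".
text = " 1 alpha\n 1 beta\n 1 gamma\n"

# With the flag in the right place, ^ matches at every line start.
found = re.findall('^ 1', text, re.MULTILINE)

# re.MULTILINE is an integer-valued flag (8 in CPython).
multiline_value = int(re.MULTILINE)

# re.split(pattern, text, re.MULTILINE) puts that integer into maxsplit,
# so it behaves like the call below: no flags, at most 8 splits.
# Without MULTILINE, ^ matches only at the very start of the string.
pieces = re.split('^ 1', text, maxsplit=multiline_value)

print(len(found))   # number of line-start matches
print(len(pieces))  # number of pieces from the mis-called split
```

Here findall sees all three line starts, while the split only cuts at
the start of the string, so the two counts no longer differ by one.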


Yep.  Thanks for pointing that out.  I guess I just assumed that
re.split was similar to re.search/match/findall in what it accepted as
function parameters.  I guess I'll have to use a \n instead of a ^ for
split.

You could use the .split method of a pattern object instead:

    tables = re.compile('^ 1', re.MULTILINE).split(line)
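
As a rough illustration of the compiled-pattern approach, using made-up
sample text rather than the original data: the flag is fixed at compile
time, so split and findall agree on where the matches are. The last
call assumes a Python version whose re.split accepts a flags keyword
(later releases do; older ones may not).

```python
import re

# Hypothetical sample text: a header line, then three lines starting with " 1".
text = "header\n 1 alpha\n 1 beta\n 1 gamma\n"

# The flag belongs to the compiled pattern, so it cannot end up in maxsplit.
pattern = re.compile('^ 1', re.MULTILINE)

found = pattern.findall(text)   # one entry per line-start match
tables = pattern.split(text)    # one more piece than there are matches

# Assumption: this re.split supports flags= as a keyword argument.
tables_kw = re.split('^ 1', text, flags=re.MULTILINE)

print(len(found), len(tables))
```

With the pattern matching at each line start, the number of pieces is
the number of matches plus one, which is the "different by 1"
relationship expected in the original question.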
