On Jan 11, 3:35 pm, Jeremy <jlcon...@gmail.com> wrote:
> Hello all,
>
> I am using re.split to separate some text into logical structures.
> The trouble is that re.split doesn't find everything while re.findall
> does; i.e.:
>
> > found = re.findall('^ 1', line, re.MULTILINE)
> > len(found)
> 6439
> > tables = re.split('^ 1', line, re.MULTILINE)
> > len(tables)
> 1
>
> Can someone explain why these two commands are giving different
> results? I thought I should have the same number of matches (or maybe
> different by 1, but not 6000!)
>
> Thanks,
> Jeremy

re.split isn't receiving re.MULTILINE as a flag: in this version it doesn't take any flags. Its third argument is maxsplit, so that is where your re.MULTILINE ends up (its value happens to be 8 in my implementation). With no MULTILINE flag in effect, ^ matches only at the very start of the string, and your text evidently doesn't begin with the pattern, so re.split finds nothing to split on and hands back the whole string as a single item.

You can see the third argument acting as maxsplit here; the first call stops after 8 splits:

>>> a
'split(pattern, string, maxsplit=0)\n    Split the source string by the occurrences of the pattern,\n    returning a list containing the resulting substrings.\n'
>>> re.split(" ", a, re.MULTILINE)
['split(pattern,', 'string,', 'maxsplit=0)\n', '', '', '', 'Split', 'the', 'source string by the occurrences of the pattern,\n    returning a list containing the resulting substrings.\n']
>>> re.split(" ", a)
['split(pattern,', 'string,', 'maxsplit=0)\n', '', '', '', 'Split', 'the', 'source', 'string', 'by', 'the', 'occurrences', 'of', 'the', 'pattern,\n', '', '', '', 'returning', 'a', 'list', 'containing', 'the', 'resulting', 'substrings.\n']

Iain
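
For what it's worth, here is a minimal sketch of two ways to get the multiline split the original post was after. It is not from the thread: the sample `text` is made up to stand in for the real data, and the names `table_start`, `tables_kw` and `mistaken` are only for illustration. Compiling the pattern with the flag should work on any version; the `flags=` keyword on re.split assumes Python 2.7 / 3.1 or later.

```python
import re

# Hypothetical stand-in for the real data: three "tables", each starting with " 1".
text = " 1 alpha\n 2 beta\n 1 gamma\n 3 delta\n 1 epsilon\n"

# What findall sees when the flag really is a flag.
found = re.findall(r"^ 1", text, re.MULTILINE)

# Option 1: bake the flag into a compiled pattern, then split with it.
table_start = re.compile(r"^ 1", re.MULTILINE)
tables = table_start.split(text)

# Option 2 (assumes Python 2.7 / 3.1 or later): pass the flag by keyword.
tables_kw = re.split(r"^ 1", text, flags=re.MULTILINE)

# The call from the question is equivalent to this: the flag's integer
# value lands in maxsplit, and ^ only matches at the start of the string.
mistaken = re.split(r"^ 1", text, maxsplit=int(re.MULTILINE))

print(len(found))     # number of matches
print(len(tables))    # one more than the number of matches
print(len(mistaken))  # at most 2, however many lines start with " 1"
```

Since this sample text begins with the pattern, the first item of `tables` is an empty string, which is where the "different by 1" in the question comes from.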