> Your regex says "Zero or more consecutive occurrences of
> something, always returning the most possible". That's
> what it does, at every position - only matching emptiness
> where it couldn't match anything (findall then skips a
> character to avoid overlapping/infinite empty
> matches), and at all other times matching the most
> possible (e.g. "has a lam" not "has", " a ", "lam").
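
To check that description against the simpler pattern, here is a minimal sketch. It assumes Python 3.7 or later, and the variable names are my own. It uses re.finditer so that the whole match (group 0) is visible, because re.findall returns only the captured group when the pattern contains one.

```python
import re

text = 'Mary has a lamb'
pattern = r'(.a.)*'

# Whole-match text (group 0) at every position where the pattern matches,
# including the empty matches.
whole_matches = [m.group(0) for m in re.finditer(pattern, text)]

# With a single capture group, findall reports that group instead of the
# whole match. A repeated group keeps only its last repetition, and a group
# that did not take part in the match is reported as ''.
group_only = re.findall(pattern, text)

print(whole_matches)
print(group_only)
```

In the first list "has a lam" should appear as one greedy match, while the second list shows only the last repetition, "lam", for that same position.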

You have nearly convinced me. You are correct for the regex '(.a.)*'. What I had in mind was this regex: '((.a.)*)*'. I confused myself when I added the enclosing (). Could you please reconsider how you would work through this new one and check whether my steps are correct?

If you agree with my 7-step execution for the new regex, then we have finally found a real bug in re.findall:

>>> re.findall('((.a.)*)*', 'Mary has a lamb')
[('', 'Mar'), ('', ''), ('', ''), ('', 'lam'), ('', ''), ('', '')]

Cheers,
Yingjie