--- On Thu, 11/25/10, Phlip <phlip2...@gmail.com> wrote:

> From: Phlip <phlip2...@gmail.com>
> Subject: a regexp riddle: re.search(r'
> To: python-list@python.org
> Date: Thursday, November 25, 2010, 8:46 AM
> HypoNt:
>
> I need to turn a human-readable list into a list():
>
>   print re.search(r'(?:(\w+), |and (\w+))+', 'whatever a, bbb, and c').groups()
>
> That currently returns ('c',). I'm trying to match "any word \w+
> followed by a comma, or a final word preceded by and."
>
> The match returns 'a, bbb, and c', but the groups return ('bbb', 'c').
> What do I type for .groups() to also get the 'a'?
>
First of all, 'bbb' corresponds to the first capturing group and 'c' to the second. 'a' is lost because it was the first match of the first group, and that group then matched a second time on 'bbb'. In general, a capturing group that repeats only remembers its last match.

It also seems that your re can match just 'and c' on its own, which does not seem to be your intention. So it may be more intuitively written as:

    r'(?:(\w+), )+and (\w+)'

I'm not sure how to get it done in one step, but it would be easy to first get the whole match, then process it with:

    re.findall(r'(\w+)(?:,|$)', the_whole_match)

cheers,
Yingjie
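
For reference, the two-step approach described above could be sketched roughly as follows. This is only an illustration: the helper name split_human_list is made up for this example and is not part of the re module, and the sketch assumes the input has the "x, y, and z" shape from the question.

```python
import re

def split_human_list(text):
    """Hypothetical helper: turn 'a, bbb, and c' style text into a list."""
    # Step 1: find the whole "x, y, and z" span. group(0) is the full
    # match, so it does not matter that group 1 keeps only its last capture.
    whole = re.search(r'(?:(\w+), )+and (\w+)', text)
    if whole is None:
        return []
    # Step 2: pull out every word that is followed by a comma or by the
    # end of the matched text. 'and' is skipped because a space follows it.
    return re.findall(r'(\w+)(?:,|$)', whole.group(0))

print(split_human_list('whatever a, bbb, and c'))  # ['a', 'bbb', 'c']
```

Using findall in the second step sidesteps the "last match only" behaviour, since each word becomes its own match instead of a repeated capture inside one match.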