On 11/24/2010 10:46 PM, Phlip wrote:
> HypoNt:
>
> I need to turn a human-readable list into a list():
>
>   print re.search(r'(?:(\w+), |and (\w+))+', 'whatever a, bbb, and c').groups()
>
> That currently returns ('c',). I'm trying to match "any word \w+
> followed by a comma, or a final word preceded by and."
>
> The match returns 'a, bbb, and c', but the groups return ('bbb', 'c').
> What do I type for .groups() to also get the 'a'?
>
> Please go easy on me (and no RTFM!), because I have only been using
> regular expressions for about 20 years...

A kind of lazy way just uses a pattern for the separators to fuel a call
to re.split(). I assume that " and " and " , " are both acceptable in any
position.

The best I've been able to do so far (throwing away every second element,
because of split's annoying habit of including the matches of any groups
in the pattern) is:

>>> re.split(r"\s*(,|and)?\s*", 'whatever a, bbb, and c')[::2]
['whatever', 'a', 'bbb', '', 'c']

That empty string is there because the ", and" isn't recognised as a
single delimiter. A parsing package might give you better results.

regards
 Steve
-- 
Steve Holden
Holden Web LLC http://www.holdenweb.com/
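
For what it's worth, here is a minimal sketch of the same split-on-separators idea, written so that ", and" counts as one delimiter. It makes two assumptions that are not in the posts above: items are separated only by commas and/or a standalone "and", and plain whitespace is not a separator (so "whatever a" would stay together as one item). The names SEPARATOR and split_human_list are hypothetical. Every alternative consumes at least one character, which matters because from Python 3.7 onward re.split() also splits on empty matches, so a pattern that can match the empty string behaves differently there.

```python
import re

# Assumed separators: ", and ", a bare ",", or a standalone " and ".
# There are no capturing groups, so re.split() returns only the items
# and nothing has to be thrown away afterwards.
SEPARATOR = re.compile(r"\s*,\s*and\s+|\s*,\s*|\s+and\s+")


def split_human_list(text):
    """Split a human-readable list such as 'a, bbb, and c' into its items."""
    # Drop empty strings left by leading or trailing separators.
    return [item for item in SEPARATOR.split(text.strip()) if item]


print(split_human_list("a, bbb, and c"))  # ['a', 'bbb', 'c']
```

Trying the longest alternative first is what lets ", and " be consumed in one go instead of leaving an empty string between the comma and the "and".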