On Feb 7, 11:18 pm, LaundroMat <laun...@gmail.com> wrote:
> Hi,
>
> I'm quite new to regular expressions, and I wonder if anyone here
> could help me out.
>
> I'm looking to split strings that ideally look like this: "Update: New
> item (Household)" into a group.
> This expression works ok: '^(Update:)?(.*)(\(.*\))$' - it returns
> ("Update", "New item", "(Household)")
>
> Some strings will look like this however: "Update: New item (item)
> (Household)". The expression above still does its job, as it returns
> ("Update", "New item (item)", "(Household)").
>
> It does not work however when there is no text in parentheses (eg
> "Update: new item"). How can I get the expression to return a tuple
> such as ("Update:", "new item", None)?

I don't see how it can be done without some post-matching adjustment.
Try this:

C:\junk>type mathieu.py
import re

tests = [
    "Update: New item (Household)",
    "Update: New item (item) (Household)",
    "Update: new item",
    "minimal",
    "parenthesis (plague) (has) (struck)",
    ]

regex = re.compile(r"""
    (Update:)?      # optional prefix
    \s*             # ignore whitespace
    ([^()]*)        # any non-parentheses stuff
    (\([^()]*\))?   # optional (blahblah)
    \s*             # ignore whitespace
    (\([^()]*\))?   # another optional (blahblah)
    $
    """, re.VERBOSE)

for i, test in enumerate(tests):
    print("Test #%d: %r" % (i, test))
    m = regex.match(test)
    if not m:
        print("No match")
    else:
        g = m.groups()
        print(g)
        if g[3] is not None:
            # Two parenthesised groups: fold the first one back
            # into the item text so the result has three elements.
            x = (g[0], g[1] + g[2], g[3])
        else:
            x = g[:3]
        print(x)
    print()

C:\junk>mathieu.py
Test #0: 'Update: New item (Household)'
('Update:', 'New item ', '(Household)', None)
('Update:', 'New item ', '(Household)')

Test #1: 'Update: New item (item) (Household)'
('Update:', 'New item ', '(item)', '(Household)')
('Update:', 'New item (item)', '(Household)')

Test #2: 'Update: new item'
('Update:', 'new item', None, None)
('Update:', 'new item', None)

Test #3: 'minimal'
(None, 'minimal', None, None)
(None, 'minimal', None)

Test #4: 'parenthesis (plague) (has) (struck)'
No match

HTH,
John
--
http://mail.python.org/mailman/listinfo/python-list
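
A possible alternative, not part of the reply above: making the middle group lazy seems to let a single pattern return the three-element tuple directly, with no post-matching adjustment. This is only a sketch. The name split_item is hypothetical, and the pattern assumes the final parenthesised group contains no nested parentheses.

```python
import re

# Hypothetical single-pattern variant (an assumption, not the code from the post).
# The lazy (.*?) takes as little text as possible, so the optional final
# (...) group is captured whenever the string ends with one.
PATTERN = re.compile(r"""
    ^(Update:)?     # optional prefix
    \s*             # ignore whitespace
    (.*?)           # item text, matched lazily
    \s*             # ignore whitespace
    (\([^()]*\))?   # optional trailing (blahblah)
    $
    """, re.VERBOSE)


def split_item(text):
    """Return (prefix, item, category); missing parts are None."""
    m = PATTERN.match(text)
    return m.groups() if m else None


if __name__ == "__main__":
    for s in ("Update: New item (Household)",
              "Update: New item (item) (Household)",
              "Update: new item"):
        print(split_item(s))
```

Two differences from the version above are worth noting: whitespace around the item text is dropped rather than kept, and a string with three or more parenthesised groups still matches, with everything before the last group ending up in the item text.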