On Feb 8, 1:37 am, MRAB <goo...@mrabarnett.plus.com> wrote:
> LaundroMat wrote:
> > Hi,
> >
> > I'm quite new to regular expressions, and I wonder if anyone here
> > could help me out.
> >
> > I'm looking to split strings that ideally look like this: "Update: New
> > item (Household)" into a group.
> > This expression works ok: '^(Update:)?(.*)(\(.*\))$' - it returns
> > ("Update", "New item", "(Household)")
> >
> > Some strings will look like this however: "Update: New item (item)
> > (Household)". The expression above still does its job, as it returns
> > ("Update", "New item (item)", "(Household)").

Not quite true; it actually returns
('Update:', ' New item (item) ', '(Household)')
However ignoring the difference in whitespace, the OP's intention is
clear. Yours returns
('Update:', ' New item ', '(item) (Household)')

> > It does not work however when there is no text in parentheses (eg
> > "Update: new item"). How can I get the expression to return a tuple
> > such as ("Update:", "new item", None)?
>
> You need to make the last group optional and also make the middle group
> lazy: r'^(Update:)?(.*?)(?:(\(.*\)))?$'.

Why do you perpetuate the redundant ^ anchor?

> (?:...) is the non-capturing version of (...).

Why do you use
(?:(subpattern))?
instead of just plain
(subpattern)?
?

--
http://mail.python.org/mailman/listinfo/python-list
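
For anyone who wants to check the results quoted above, here is a minimal sketch. It assumes the patterns are applied with re.match (which is why the leading ^ is redundant: re.match only ever matches at the start of the string). The helper name groups() and the constant names are made up for this illustration. The last pattern, LAST_PAREN_PATTERN, is not from the thread; it is just one possible way to keep only the final parenthesised part in the last group while still allowing it to be absent.

```python
import re

# Patterns exactly as quoted in the thread.
OP_PATTERN = r'^(Update:)?(.*)(\(.*\))$'
MRAB_PATTERN = r'^(Update:)?(.*?)(?:(\(.*\)))?$'

# The simplifications asked about in the reply: no leading ^ and no
# non-capturing wrapper around the optional last group.
SIMPLER_PATTERN = r'(Update:)?(.*?)(\(.*\))?$'

# Assumption, not from the thread: disallow parentheses inside the last
# group so that only the final "(...)" is captured there.
LAST_PAREN_PATTERN = r'(Update:)?(.*?)(\([^()]*\))?$'


def groups(pattern, text):
    """Hypothetical helper: return the match groups, or None if no match."""
    m = re.match(pattern, text)
    return m.groups() if m else None


two_parens = "Update: New item (item) (Household)"
no_parens = "Update: new item"

for name in ("OP_PATTERN", "MRAB_PATTERN", "SIMPLER_PATTERN",
             "LAST_PAREN_PATTERN"):
    pattern = globals()[name]
    print(name)
    print("   ", groups(pattern, two_parens))
    print("   ", groups(pattern, no_parens))
```

Note that the whitespace around the middle group is still there in every case; calling .strip() on that group afterwards is probably the simplest way to get rid of it.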