On Mar 29, 7:22 am, "aspineux" <[EMAIL PROTECTED]> wrote:
> I want to parse
>
> '[EMAIL PROTECTED]' or '<[EMAIL PROTECTED]>' and get the email address
> [EMAIL PROTECTED]
>
> the regex is
>
> r'<[EMAIL PROTECTED]>|[EMAIL PROTECTED]'
>
> now, I want to give it a name
>
> r'<(?P<email>[EMAIL PROTECTED])>|(?P<email>[EMAIL PROTECTED])'
>
> sre_constants.error: redefinition of group name 'email' as group 2;
> was group 1
>
> BUT because I use a | , I will get only one group named 'email' !
>
> Any comment ?
>
> PS: I know the solution for this case is to use r'(?P<lt><)?(?P<email>
> [EMAIL PROTECTED])(?(lt)>)'

Regular expressions, alternation, named groups ... oh my!

It tends to get quite complex, especially if you need to reject cases
where the string contains a left bracket but not the right one, or vice
versa.

>>> import re
>>> pattern = re.compile(r'(?P<email><[EMAIL PROTECTED]>|(?<!<)[EMAIL PROTECTED](?!>))')
>>> for email in ('[EMAIL PROTECTED]', '<[EMAIL PROTECTED]>', '<[EMAIL PROTECTED]'):
...     matched = pattern.search(email)
...     if matched is not None:
...         print(matched.group('email'))
...
[EMAIL PROTECTED]
<[EMAIL PROTECTED]>

Note that the named group here wraps the whole alternation, so the
brackets are part of what group('email') returns.

I suggest you try some other solution (maybe pyparsing).

--
Hope this helps,
Steven
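
For anyone who wants to try the two workarounds discussed in this thread, here is a minimal sketch. The address pattern is redacted in the archive, so ADDR below is a hypothetical stand-in and not the pattern from the original post; the helper names are made up for illustration as well. It assumes Python 3.4 or later, for fullmatch().

```python
import re

# Hypothetical stand-in for the address pattern, which is redacted in the post.
ADDR = r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+'

# Reusing one group name in both branches is rejected at compile time,
# even though only one branch can ever match.
try:
    re.compile(rf'<(?P<email>{ADDR})>|(?P<email>{ADDR})')
    duplicate_name_rejected = False
except re.error:
    duplicate_name_rejected = True

# Workaround 1: give each branch its own name and take whichever matched.
BY_BRANCH = re.compile(rf'<(?P<bracketed>{ADDR})>|(?P<bare>{ADDR})')

# Workaround 2: the conditional pattern from the PS of the quoted message.
# (?(lt)>) requires a closing '>' only if the opening '<' was matched.
BY_CONDITION = re.compile(rf'(?P<lt><)?(?P<email>{ADDR})(?(lt)>)')


def email_by_branch(text):
    """Return the address without brackets, or None if text is malformed."""
    m = BY_BRANCH.fullmatch(text)
    if m is None:
        return None
    return m.group('bracketed') or m.group('bare')


def email_by_condition(text):
    """Same result as email_by_branch, using the conditional pattern."""
    m = BY_CONDITION.fullmatch(text)
    return m.group('email') if m else None


for candidate in ('user@example.com', '<user@example.com>',
                  '<user@example.com', 'user@example.com>'):
    print(repr(candidate), email_by_branch(candidate),
          email_by_condition(candidate))
```

Using fullmatch() instead of search() is a design choice made for this sketch: it rejects a lone '<' or '>' without needing the lookbehind and lookahead assertions, at the cost of only working when the whole string is the address.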