On Mar 29, 7:22 am, "aspineux" <[EMAIL PROTECTED]> wrote:
> I want to parse
>
> '[EMAIL PROTECTED]' or '<[EMAIL PROTECTED]>' and get the email address
> [EMAIL PROTECTED]
>
> the regex is
>
> r'<[EMAIL PROTECTED]>|[EMAIL PROTECTED]'
>
> now, I want to give it a name
>
> r'<(?P<email>[EMAIL PROTECTED])>|(?P<email>[EMAIL PROTECTED])'
>
> sre_constants.error: redefinition of group name 'email' as group 2;
> was group 1
>
> BUT because I use a | , I will get only one group named 'email' !
>
> Any comment ?
>
> PS: I know the solution for this case is to use
> r'(?P<lt><)?(?P<email>[EMAIL PROTECTED])(?(lt)>)'


Regular expressions, alternation, named groups ... oh my!

This gets complex quickly, especially if you need to reject
strings that contain a left angle bracket without the matching
right one, or vice versa.

>>> import re
>>> pattern = re.compile(r'(?P<email><[EMAIL PROTECTED]>|(?<!<)[EMAIL PROTECTED](?!>))')
>>> for email in ('[EMAIL PROTECTED]', '<[EMAIL PROTECTED]>', '<[EMAIL PROTECTED]'):
...     matched = pattern.search(email)
...     if matched is not None:
...         print(matched.group('email'))
...
[EMAIL PROTECTED]
<[EMAIL PROTECTED]>
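
If the goal is to get the bare address in both cases, one way around the
"redefinition of group name" error is to give each alternative its own
name and pick whichever one matched. The sketch below is only that, a
sketch: ADDR is a deliberately simplified, hypothetical address pattern
(not the one from the original post), the group names 'bracketed' and
'bare' are made up for illustration, and it assumes the whole string is
supposed to be a single address (Python 3.4+ for fullmatch).

```python
import re

# ADDR is a simplified, hypothetical stand-in for a real address
# pattern; it is an assumption made for this example only.
ADDR = r'[^<>\s@]+@[^<>\s@]+'

# Each alternative gets its own group name, because re rejects a
# pattern that defines the same name twice, even across '|'.
PATTERN = re.compile(r'<(?P<bracketed>%s)>|(?P<bare>%s)' % (ADDR, ADDR))


def extract_email(text):
    """Return the address, or None if text is not exactly one address."""
    # fullmatch anchors both ends, so a lone '<' or '>' is rejected.
    matched = PATTERN.fullmatch(text)
    if matched is None:
        return None
    # Only one alternative can match, so at most one group is set.
    return matched.group('bracketed') or matched.group('bare')


results = [extract_email(s) for s in
           ('user@example.com', '<user@example.com>', '<user@example.com')]
```

Anchoring with fullmatch is what rejects the unbalanced case here; with
search, a pattern like this could still find an address somewhere inside
a string that has only one of the two brackets.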


I suggest you try some other solution (maybe pyparsing).
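
As one possible non-regex alternative (swapped in here instead of
pyparsing, because it ships with the standard library), email.utils.parseaddr
already strips the angle brackets. This is a minimal sketch using made-up
sample addresses; how it treats malformed input such as a missing bracket
is not shown here and should be checked for your use case.

```python
from email.utils import parseaddr

# parseaddr returns a (display_name, address) pair; the angle
# brackets, when present, are not part of the returned address.
samples = ('user@example.com', '<user@example.com>')  # made-up inputs
addresses = [parseaddr(s)[1] for s in samples]
```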

--
Hope this helps,
Steven
