Re: make RE more cleaver to avoid inappropriate : sre_constants.error: redefinition of group name

Paddy Fri, 30 Mar 2007 08:21:10 -0800

On Mar 30, 1:44 pm, "aspineux" <[EMAIL PROTECTED]> wrote:
> On 30 mar, 00:13, "Paddy" <[EMAIL PROTECTED]> wrote:
>
> > On Mar 29, 3:22 pm, "aspineux" <[EMAIL PROTECTED]> wrote:
>
> > > I want to parse
>
> > > '[EMAIL PROTECTED]' or '<[EMAIL PROTECTED]>' and get the email address 
> > > [EMAIL PROTECTED]
>
> > > the regex is
>
> > > r'<[EMAIL PROTECTED]>|[EMAIL PROTECTED]'
>
> > > now, I want to give it a name
>
> > > r'<(?P<email>[EMAIL PROTECTED])>|(?P<email>[EMAIL PROTECTED])'
>
> > > sre_constants.error: redefinition of group name 'email' as group 2;
> > > was group 1
>
> > > BUT because I use a | , I will get only one group named 'email' !
>
> > > Any comment ?
>
> > > PS: I know the solution for this case is to use
> > > r'(?P<lt><)?(?P<email>[EMAIL PROTECTED])(?(lt)>)'
>
> > use two group names, one for each alternate form and if you are not
> > concerned with whichever matched do something like the following:
>
> The problem is the way I create this regex :-)
>
> regex={}
> regex['email']=r'(?P<email1>[EMAIL PROTECTED])'
>
> path=r'<%(email)s>|%(email)s' % regex
>
> Once more, the original question is:
> Is it normal to get an error when the same id is used on both sides
> of a | ?
>
>
>
> > >>> s1 = '[EMAIL PROTECTED]'
> > >>> s2 = '<[EMAIL PROTECTED]>'
> > >>> matchobj = re.search(r'<(?P<email1>[EMAIL PROTECTED])>|(?P<email2>[EMAIL PROTECTED])', s1)
> > >>> matchobj.groupdict()['email1'] or matchobj.groupdict()['email2']
> > '[EMAIL PROTECTED]'
> > >>> matchobj = re.search(r'<(?P<email1>[EMAIL PROTECTED])>|(?P<email2>[EMAIL PROTECTED])', s2)
> > >>> matchobj.groupdict()['email1'] or matchobj.groupdict()['email2']
> > '[EMAIL PROTECTED]'
>
> > - Paddy.


Groups are numbered left-to-right irrespective of the expression
contents. I am quite happy with the names being merely a pseudonym for
the positional group number, and I don't see a problem with not
allowing multiple occurrences of the same group name.
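
A rough sketch of the point above, in case it helps. The address
pattern ADDR and the helper extract() are hypothetical stand-ins (the
real pattern is redacted in this thread), and the sketch assumes a
Python where the compile error is reachable as re.error rather than
only as sre_constants.error. It shows that each name is just a label
for a positional group number, that reusing a name on both sides of |
is rejected when the pattern is compiled, and that two names plus
"or" recover the address either way:

```python
import re

# Hypothetical stand-in for the address pattern, which is redacted in the thread.
ADDR = r'[\w.+-]+@[\w.-]+'

# One distinct name per alternative; groups are still numbered left to right.
pattern = re.compile(r'<(?P<email1>%s)>|(?P<email2>%s)' % (ADDR, ADDR))

# groupindex maps each group name to its positional group number.
print(dict(pattern.groupindex))


def extract(text):
    """Return the address from 'addr' or '<addr>', or None if nothing matches."""
    m = pattern.search(text)
    if m is None:
        return None
    # Only one alternative can match, so the other group is None.
    return m.group('email1') or m.group('email2')


# Reusing the same name in both alternatives fails at compile time.
try:
    re.compile(r'<(?P<email>%s)>|(?P<email>%s)' % (ADDR, ADDR))
    redefinition_rejected = False
except re.error:
    redefinition_rejected = True
```

When the pattern is built from a template, as in the quoted message,
the same idea applies: each substitution needs its own group name.
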
I did see an article about REs and their speed. It seems that if
Python's RE package distinguished between 'grep style' REs and the
full set of Python REs, then there are much faster and more efficient
algorithms available for the grep style subset.

- Paddy.

-- 
http://mail.python.org/mailman/listinfo/python-list
