On Mar 30, 1:44 pm, "aspineux" <[EMAIL PROTECTED]> wrote:
> On 30 mar, 00:13, "Paddy" <[EMAIL PROTECTED]> wrote:
>
> > On Mar 29, 3:22 pm, "aspineux" <[EMAIL PROTECTED]> wrote:
> >
> > > I want to parse
> > >
> > > '[EMAIL PROTECTED]' or '<[EMAIL PROTECTED]>' and get the email address
> > > [EMAIL PROTECTED]
> > >
> > > the regex is
> > >
> > > r'<[EMAIL PROTECTED]>|[EMAIL PROTECTED]'
> > >
> > > now, I want to give it a name
> > >
> > > r'<(?P<email>[EMAIL PROTECTED])>|(?P<email>[EMAIL PROTECTED])'
> > >
> > > sre_constants.error: redefinition of group name 'email' as group 2;
> > > was group 1
> > >
> > > BUT because I use a | , I will get only one group named 'email' !
> > >
> > > Any comment ?
> > >
> > > PS: I know the solution for this case is to use
> > > r'(?P<lt><)?(?P<email>[EMAIL PROTECTED])(?(lt)>)'
> >
> > use two group names, one for each alternate form and if you are not
> > concerned with whichever matched do something like the following:
>
> The problem is the way I create this regex :-)
>
> regex={}
> regex['email']=r'(?P<email1>[EMAIL PROTECTED])'
>
> path=r'<%(email)s>|%(email)s' % regex
>
> Once more, the original question is :
> Is it normal to get an error when the same id used on both side of a
> |
>
> > >>> s1 = '[EMAIL PROTECTED]'
> > >>> s2 = '<[EMAIL PROTECTED]>'
> > >>> matchobj = re.search(r'<(?P<email1>[EMAIL PROTECTED])>|(?P<email2>[EMAIL PROTECTED])', s1)
> > >>> matchobj.groupdict()['email1'] or matchobj.groupdict()['email2']
> > '[EMAIL PROTECTED]'
> > >>> matchobj = re.search(r'<(?P<email1>[EMAIL PROTECTED])>|(?P<email2>[EMAIL PROTECTED])', s2)
> > >>> matchobj.groupdict()['email1'] or matchobj.groupdict()['email2']
> > '[EMAIL PROTECTED]'
> >
> > - Paddy.
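To make the behaviour in the quoted exchange concrete, here is a minimal sketch. The address pattern in the thread was redacted, so `ADDR` below is a hypothetical stand-in, and `user@example.com` is an invented test address; neither comes from the original posts. It shows the compile-time error for a repeated name, the name-to-number mapping in `groupindex`, the two-name workaround, and the conditional form from the PS, which needs only one name.

```python
import re

# Hypothetical stand-in for the address pattern redacted in the thread.
ADDR = r"[\w.+-]+@[\w.-]+"

# Reusing one group name on both sides of '|' is rejected when compiling.
try:
    re.compile(r"<(?P<email>%s)>|(?P<email>%s)" % (ADDR, ADDR))
    duplicate_ok = True
except re.error as exc:
    duplicate_ok = False
    print("compile failed:", exc)

# Two distinct names compile; each name maps to a positional group number.
two_names = re.compile(r"<(?P<email1>%s)>|(?P<email2>%s)" % (ADDR, ADDR))
print(dict(two_names.groupindex))

def extract_two_names(text):
    """Return whichever alternative matched, or None if neither did."""
    m = two_names.search(text)
    if m is None:
        return None
    return m.group("email1") or m.group("email2")

# Conditional form from the PS: '>' is required only if '<' was matched.
one_name = re.compile(r"(?P<lt><)?(?P<email>%s)(?(lt)>)" % ADDR)

def extract_one_name(text):
    """Same result with a single group name and no alternation."""
    m = one_name.search(text)
    return m.group("email") if m else None
```

The conditional form also sidesteps the templating problem, since the named fragment is substituted into the pattern only once.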
Groups are numbered left-to-right irrespective of the expression contents. I am quite happy with the names being merely a pseudonym for the positional group number and don't see a problem with not allowing multiple occurrences of the same group name.

I did see some article about REs and their speed. It seems that if Python's RE package distinguished between 'grep style' REs and the full set of Python REs, then there are much faster and more efficient algorithms available for the grep-style subset.

- Paddy.
--
http://mail.python.org/mailman/listinfo/python-list