MRAB wrote:
On 25/11/2010 14:40, Jean-Michel Pichavant wrote:
Hi guys,
I'm struggling to match patterns that end with either a comma ',' or an
end of line '$'.
import re
ex1 = 'sumthin,'
ex2 = 'sumthin'
m1 = re.match('(?P<something>\S+),', ex1)
m2 = re.match('(?P<something>\S+)$', ex2)
m3 = re.match('(?P<something>\S+)[,$]', ex1)
m4 = re.match('(?P<something>\S+)[,$]', ex2)
print m1, m2
print m3
print m4
<_sre.SRE_Match object at 0x8834de0> <_sre.SRE_Match object at 0x8834e20>
<_sre.SRE_Match object at 0x8834e60>
None
My problem is that m4 is None while I'd like it to match ex2.
Any clue?
Within a character set '$' is a literal '$' and not end-of-string, just
as '\b' is '\x08' and not word-boundary.
Use a lookahead instead:
>>> re.match('(?P<something>\S+)(?=,|$)', ex1)
<_sre.SRE_Match object at 0x01719FA0>
>>> re.match('(?P<something>\S+)(?=,|$)', ex2)
<_sre.SRE_Match object at 0x016937E0>
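A minimal sketch of the point above, written so it also runs on Python 3. The third sample string 'sumthin$' and the variable names are made up for illustration; they are not from the original post. It shows that '[,$]' accepts a literal dollar sign but not end-of-string, while the lookahead accepts all three inputs.

```python
import re

# Inside a character class, '$' is an ordinary character, not an anchor.
in_class = re.compile(r'(?P<something>\S+)[,$]')
# Outside a class, '$' in the lookahead means end-of-string again.
lookahead = re.compile(r'(?P<something>\S+)(?=,|$)')

# 'sumthin$' is a hypothetical extra sample with a literal dollar sign.
samples = ['sumthin,', 'sumthin', 'sumthin$']

class_hits = [in_class.match(s) is not None for s in samples]
lookahead_hits = [lookahead.match(s) is not None for s in samples]

print(class_hits)      # [True, False, True]
print(lookahead_hits)  # [True, True, True]
```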
Thanks, it works that way.
By the way, I don't get the difference between non-capturing parentheses
(?:) and lookahead parentheses (?=):
re.match('(?P<something>\S+)(?:,|$)', ex2).groups()
('sumthin',)
re.match('(?P<something>\S+)(?=,|$)', ex2).groups()
('sumthin',)
JM
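On the (?:) versus (?=) question, a short sketch that may help: groups() looks the same for both because neither construct creates a capture group. The difference is whether the trailing comma becomes part of the overall match. The names `consuming` and `peeking` are invented for this example, and \w+ is swapped in for \S+ as an assumption, since a greedy \S+ would swallow the comma itself and hide the difference.

```python
import re

ex1 = 'sumthin,'

# \w+ (an illustrative substitution for \S+) cannot match the comma,
# so the comma is left for the group that follows.
consuming = re.match(r'(?P<something>\w+)(?:,|$)', ex1)  # non-capturing group
peeking = re.match(r'(?P<something>\w+)(?=,|$)', ex1)    # lookahead

# Same captured groups either way.
print(consuming.groups(), peeking.groups())

# The non-capturing group consumes the comma; the lookahead only checks it.
print(repr(consuming.group(0)), consuming.end())  # 'sumthin,' 8
print(repr(peeking.group(0)), peeking.end())      # 'sumthin' 7
```

This matters mostly when matching continues after this point (for example with re.finditer or a longer pattern): after the lookahead the comma is still unconsumed.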