Wolfgang Rohdewald wrote:
Hi,
I want to match a string only if a word (C1 in this example) appears
at most once in it. This is what I tried:
re.match(r'(.*?C1)((?!.*C1))','C1b1b1b1 b3b3b3b3 C1C2C3').groups()
('C1b1b1b1 b3b3b3b3 C1', '')
re.match(r'(.*?C1)','C1b1b1b1 b3b3b3b3 C1C2C3').groups()
('C1',)
but this should not have matched. Why is the .*? behaving greedily
if followed by (?!.*C1)? I would have expected that re first
evaluates (.*?C1) before proceeding at all.
I also tried:
re.search(r'(.*?C1(?!.*C1))','C1b1b1b1 b3b3b3b3 C1C2C3C4').groups()
('C1b1b1b1 b3b3b3b3 C1',)
with the same problem.
How could this be done?
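You're currently looking for a C1 that's not followed by another; the
solution is to check first whether there are two, with a negative
lookahead at the start of the pattern. With two C1s in the string,
match() returns None, so the call to .groups() fails: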
>>> re.match(r'(?!.*?C1.*?C1)(.*?C1)','C1b1b1b1 b3b3b3b3 C1C2C3').groups()
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    re.match(r'(?!.*?C1.*?C1)(.*?C1)','C1b1b1b1 b3b3b3b3 C1C2C3').groups()
AttributeError: 'NoneType' object has no attribute 'groups'
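
As for why the .*? looked greedy: it is still lazy, but when the
lookahead after the first C1 fails, the engine backtracks and lets .*?
grow until the next C1, where the lookahead finally succeeds. A minimal
sketch of both points follows; the names AT_MOST_ONCE and match_single
are made up for illustration, and the helper returns None instead of
raising when there is no match.

```python
import re

text_two = 'C1b1b1b1 b3b3b3b3 C1C2C3'   # contains C1 twice
text_one = 'C1b1b1b1 b3b3b3b3 C2C3'     # contains C1 once

# The trailing lookahead fails after the first C1 (another C1 follows),
# so the engine backtracks and extends the lazy .*? up to the last C1.
lazy = re.match(r'(.*?C1)(?!.*C1)', text_two)
print(lazy.group(1))            # C1b1b1b1 b3b3b3b3 C1

# Hypothetical names: reject the string up front if it contains two C1s.
AT_MOST_ONCE = re.compile(r'(?!.*?C1.*?C1)(.*?C1)')

def match_single(text):
    """Return the match groups, or None if C1 occurs twice or not at all."""
    m = AT_MOST_ONCE.match(text)
    return m.groups() if m else None

print(match_single(text_two))   # None
print(match_single(text_one))   # ('C1',)
```

Note that this pattern still needs one C1 to match at all, because of
the (.*?C1) group, and that . does not cross newlines unless re.DOTALL
is passed.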