Gabriel Rossetti wrote:
Hello everyone,
I am trying to write a regex pattern to match an ID in a URL only if it
is not a given ID. Here's an example, the ID not to match is
"14522XXX98", if my URL is "/profile.php?id=14522XXX99" I want it to
match and if it's "/profile.php?id=14522XXX98" I want it not to. I tried
this:
>>> re.search(r"/profile.php\?id=(\d+)(?<!14522XXX98)",
"/profile.php?id=14522XXX98").groups()
('14522XXX9',)
which should not match, but it does, then I tried this :
[snip]
How can '(\d+)' be capturing '14522XXX9'? '\d' matches only digits!
Anyway, your basic problem is that it initially matches '14522XXX98',
but then the lookbehind rejects that, so it backtracks and releases the
last character, giving '14522XXX9', which is not rejected because
'14522XXX9' isn't '14522XXX98'.
Try putting a '\b' after the '\d+' to reject partial IDs.
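
To make that concrete, here is a rough sketch of both behaviours. The ID
in the original post is obfuscated, so I'm assuming a made-up all-digit ID
("1452200098") in its place; the names EXCLUDED_ID, without_boundary,
with_boundary and match_id are mine, purely for illustration. I've also
escaped the '.' in 'profile.php', which the original pattern left bare.

```python
import re

# Hypothetical all-digit ID standing in for the obfuscated one in the thread.
EXCLUDED_ID = "1452200098"

# Original approach: nothing pins the end of the digits, so after the
# lookbehind rejects the full ID the engine backtracks to a shorter capture.
without_boundary = re.compile(
    r"/profile\.php\?id=(\d+)(?<!" + EXCLUDED_ID + ")"
)

# Suggested fix: \b only succeeds at the end of the whole run of digits,
# so a shortened capture can no longer satisfy the pattern.
with_boundary = re.compile(
    r"/profile\.php\?id=(\d+)\b(?<!" + EXCLUDED_ID + ")"
)


def match_id(pattern, url):
    """Return the captured ID, or None if the pattern does not match."""
    m = pattern.search(url)
    return m.group(1) if m else None


print(match_id(without_boundary, "/profile.php?id=1452200098"))  # 145220009
print(match_id(with_boundary, "/profile.php?id=1452200098"))     # None
print(match_id(with_boundary, "/profile.php?id=1452200099"))     # 1452200099
```

One caveat with the lookbehind form: it only looks at the last ten
characters, so it would also reject a longer ID that merely ends in the
excluded digits. If that matters, a negative lookahead placed before the
group, such as (?!1452200098\b)(\d+)\b, might be a better fit.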