MRAB wrote:
Gabriel Rossetti wrote:
Hello everyone,
I am trying to write a regex pattern to match an ID in a URL only if
it is not a given ID. Here's an example, the ID not to match is
"14522XXX98", if my URL is "/profile.php?id=14522XXX99" I want it to
match and if it's "/profile.php?id=14522XXX98" I want it not to. I
tried this:
>>> re.search(r"/profile.php\?id=(\d+)(?<!14522XXX98)",
"/profile.php?id=14522XXX98").groups()
('14522XXX9',)
which should not match, but it does. Then I tried this:
[snip]
How can '(\d+)' be capturing '14522XXX9'? '\d' matches only digits!
:-), yes, I had replaced the digits for the example (originally longer, etc)
Anyway, your basic problem is that it initially matches '14522XXX98',
but then the lookbehind rejects that, so it backtracks and releases the
last character, giving '14522XXX9', which is not rejected because
'14522XXX9' isn't '14522XXX98'.
Try putting a '\b' after the '\d+' to reject partial IDs.
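To make the backtracking visible, here is a minimal sketch comparing the original pattern with the '\b' version. It assumes an all-digit stand-in ID, "1452200098", since the real digits were replaced in the example; the names BLOCKED_ID, loose, strict and extract_id are made up for illustration. The '.' in "profile.php" is also escaped here, which the original pattern did not do.

```python
import re

# Hypothetical all-digit stand-in for the masked ID "14522XXX98".
BLOCKED_ID = "1452200098"

# Original approach: \d+ may give back its last digit, and the shorter
# prefix then passes the negative lookbehind.
loose = re.compile(r"/profile\.php\?id=(\d+)(?<!" + BLOCKED_ID + ")")

# With \b: the digits must end at a word boundary, so a partial ID
# (followed by another digit) can never match.
strict = re.compile(r"/profile\.php\?id=(\d+)\b(?<!" + BLOCKED_ID + ")")

def extract_id(pattern, url):
    """Return the captured ID, or None if the pattern does not match."""
    m = pattern.search(url)
    return m.group(1) if m else None

print(extract_id(loose, "/profile.php?id=1452200098"))   # partial ID leaks through
print(extract_id(strict, "/profile.php?id=1452200098"))  # rejected
print(extract_id(strict, "/profile.php?id=1452200099"))  # accepted
```

The first call returns the truncated ID '145220009', the second returns None, and the third returns the full '1452200099'.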
That did it, thanks a lot, I would never have found that.
Gabriel