On 2014-04-25 17:55, Chris Angelico wrote:
On Sat, Apr 26, 2014 at 2:30 AM, Robin Becker <ro...@reportlab.com> wrote:
Whilst translating some JavaScript code I find that this
A=re.compile('.{1,+3}').findall(p)
doesn't give any error, but doesn't manage to find the strings in p that I
want (len(A)==>0). The correct translation should have been
A=re.compile('.{1,3}').findall(p)
which works fine.
Should
re.compile('.{1,+3}')
raise an error? It doesn't on Python 2.7 or 3.3.
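A minimal reproduction of the report might look like the sketch below. The original value of p was not posted, so the sample string here is made up purely for illustration.

```python
import re

# Hypothetical sample input; the real p from the report was not given.
p = "abcdefgh"

# Intended pattern: split p into chunks of one to three characters.
chunks = re.compile('.{1,3}').findall(p)

# Mistranslated pattern: compiles without complaint, but finds nothing in p.
mistranslated = re.compile('.{1,+3}').findall(p)

print(chunks)
print(mistranslated)
```

With this input the first pattern yields three chunks and the second yields an empty list, which matches the symptom described.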
I would say the surprising part is that your JS code doesn't mind an
extraneous character in the regex. In a brace like that, negative
numbers have no meaning, so I would expect the definition of the regex
to look for digits, not "anything that can be parsed as a number". So
you've uncovered a bug in your code that just happened to work in JS.
Should it raise an error? Good question. Quite possibly it should,
unless that has some other meaning that I'm not familiar with. Do you
know how it's being interpreted? I'm not entirely sure what you mean
by "len(A)==>0", as ==> isn't an operator in Python or JS. Best way to
continue, I think, would be to use regular expression matching (rather
than findall'ing) and something other than dot, and tabulate input
strings, expected result (match or no match), what JS does, and what
Python does. For instance:
Regex: "^a{1,3}$"
"": Not expected, not Python
"a": Expected, Python
"aa": Expected, Python
"aaa": Expected, Python
"aaaa": Not expected, not Python
Just what we'd expect. Now try the same thing with the plus in there.
I'm finding that none of the above strings yields a match. Maybe
there's something else being matched?
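The Python half of that table can be generated rather than typed by hand. This is only a sketch: the helper name tabulate is made up for illustration, and the JS column would still have to be filled in separately.

```python
import re

PATTERNS = ["^a{1,3}$", "^a{1,+3}$"]
INPUTS = ["", "a", "aa", "aaa", "aaaa"]

def tabulate(patterns, inputs):
    """Return {pattern: {input: True if re.match succeeds}} for each pair."""
    return {
        pat: {s: re.match(pat, s) is not None for s in inputs}
        for pat in patterns
    }

results = tabulate(PATTERNS, INPUTS)
for pat in PATTERNS:
    print("Regex: %r" % pat)
    for s in INPUTS:
        print("  %r: %s" % (s, "match" if results[pat][s] else "no match"))
```

Using match against anchored patterns keeps each row a plain yes/no, which is easier to compare across engines than findall output.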
The DEBUG flag helps to show what's happening:
>>> r = re.compile('.{1,+3}', flags=re.DEBUG)
any None
literal 123
literal 49
max_repeat 1 4294967295
  literal 44
literal 51
literal 125
When it's parsing the pattern it's doing this:
    .    OK, match any character
    {    Looks like the start of a quantifier
    1    OK, the minimum count
    ,    OK, the maximum count probably follows
    +    Error; it looks like the '{' was a literal
Trying again from the brace:
    {    Literal
    1    Literal
    ,    Literal
    +    Repeat the previous item one or more times
    3    Literal
    }    Literal
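One way to check that reading is to feed the pattern strings that contain the braces literally. The sketch below assumes the fallback parse described above; the candidate strings and the explicitly escaped pattern are my own, chosen for illustration.

```python
import re

# The pattern from the report, anchored so that only whole strings count.
fallback = re.compile('^.{1,+3}$')

# The same thing spelled out with escaped braces, as the parser seems to read it:
# any character, '{', '1', one or more ',', '3', '}'.
explicit = re.compile(r'^.\{1,+3\}$')

candidates = ["x{1,3}", "x{1,,,3}", "x{13}", "xyz"]

matches = {s: fallback.match(s) is not None for s in candidates}
explicit_matches = {s: explicit.match(s) is not None for s in candidates}

for s in candidates:
    print("%r: %s" % (s, "match" if matches[s] else "no match"))
```

If the two patterns agree on every candidate, that supports the idea that the '+' ends up repeating the literal comma rather than being part of a quantifier.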