Re: Repeating assertions in regular expression

2012-01-03 Thread Devin Jeanpierre
> Put simply, it doesn't occur often enough to be worth it. The cost
> outweighs the potential benefit.

I don't buy it. You could backtrack instead of failing for \b+ and \b*, and it would be almost as fast as this optimization.

-- Devin

On Tue, Jan 3, 2012 at 1:57 PM, MRAB wrote:
> On 03/01/20

Re: Repeating assertions in regular expression

2012-01-03 Thread MRAB
On 03/01/2012 09:45, Devin Jeanpierre wrote:

\\b\\b and \\b{2} aren't equivalent ? This sounds suspiciously like a bug! Why the wording is "should never" ? Repeating a zero-width assertion is not forbidden, for instance :

import re
re.compile("\\b\\b\w+\\b\\b")
<_sre.SRE_Pattern obje

Re: Repeating assertions in regular expression

2012-01-03 Thread Devin Jeanpierre
> \\b\\b and \\b{2} aren't equivalent ? This sounds suspiciously like a bug!
> Why the wording is "should never" ? Repeating a zero-width assertion is not
> forbidden, for instance :
> import re
> re.compile("\\b\\b\w+\\b\\b")
> <_sre.SRE_Pattern object at 0xb7831140>

I believe this

Repeating assertions in regular expression

2012-01-03 Thread candide
The regular expression HOWTO (http://docs.python.org/howto/regex.html#more-metacharacters) explains the following # -- zero-width assertions should never be repeated, because if they match once at a given location, they can obviously be matched an infinite number o
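For anyone who wants to reproduce the behaviour discussed in this thread, here is a minimal sketch. The helper name try_compile is hypothetical and not part of the re module. It only relies on re.compile, re.error and search. Whether the quantified forms (\b{2}, \b+, \b*) are accepted may depend on the Python version, so the script records the outcome rather than assuming it.

```python
import re


def try_compile(pattern):
    """Return the compiled pattern, or None if the re module rejects it.

    Hypothetical helper for this experiment, not part of the re module.
    """
    try:
        return re.compile(pattern)
    except re.error:
        return None


# Writing the assertion out twice is accepted, as shown in the thread,
# and the doubled \b matches at the same position a single \b would.
doubled = try_compile(r"\b\b\w+\b\b")
first_word = doubled.search("hello world").group()

# Quantified forms: acceptance may vary between Python versions,
# so only record whether each one compiles.
quantified = {p: try_compile(p) is not None for p in (r"\b{2}", r"\b+", r"\b*")}

print(first_word)
print(quantified)
```

The concatenated pattern finds "hello" in "hello world". The dictionary shows, for the interpreter at hand, which of the quantified spellings compile.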