> Put simply, it doesn't occur often enough to be worth it. The cost
> outweighs the potential benefit.

I don't buy it. You could backtrack instead of failing for \b+ and
\b*, and it would be almost as fast as this optimization.
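
For the plain concatenated case, at least, the equivalence is easy to check. This is only a rough sketch: the sample text and variable names are made up for illustration, and it deliberately leaves out \b+ and \b*, whose handling is the point under dispute.

```python
import re

# Made-up sample text; any string with a few words would do.
text = "repeat after me: word boundaries are zero-width"

# A single boundary assertion on each side of a word...
single = re.compile(r"\b\w+\b")
# ...versus the same assertion written twice in a row.
doubled = re.compile(r"\b\b\w+\b\b")

single_matches = single.findall(text)
doubled_matches = doubled.findall(text)

# \b consumes no characters, so asserting it twice at the same
# position cannot change which substrings match.
same = single_matches == doubled_matches
print(same)
```

So collapsing r"\b\b" to r"\b" is safe; whether it is worth looking for is a separate question.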

-- Devin

On Tue, Jan 3, 2012 at 1:57 PM, MRAB <pyt...@mrabarnett.plus.com> wrote:
> On 03/01/2012 09:45, Devin Jeanpierre wrote:
>>>  \\b\\b and \\b{2} aren't equivalent ?
>> This sounds suspiciously like a bug!
>>>  Why the wording is "should never" ? Repeating a zero-width assertion is
>>>  not forbidden, for instance :
>>>  >>> import re
>>>  >>> re.compile("\\b\\b\w+\\b\\b")
>>>  <_sre.SRE_Pattern object at 0xb7831140>
>> I believe this is meant to refer to arbitrary-length repetitions, such
>> as r'\b*', not simple concatenations like that. r'\b*' will abort the
>> whole match if it is run on a boundary, because Python detects a
>> repetition of a zero-width match and decides this is an error.
> r"\b+" can be optimised to r"\b", but r"\b*" can be optimised to r"".
> r"\b\b", r"\b\b\b", etc, can be optimised to r"\b".
> So why isn't it optimised?
> Because every potential optimisation has a cost, which is the time it
> would take to look for it.
> That cost needs to be balanced against the potential benefit.
> How often do you see repeated r"\b"?
> Put simply, it doesn't occur often enough to be worth it. The cost
> outweighs the potential benefit.
> --
> http://mail.python.org/mailman/listinfo/python-list
