René Scharfe <[email protected]> writes:
> There could be any characters except NUL and LF between the 4096 zeros
> and "0$" for the latter to match wrongly, no? So there are 4095
> opportunities for the misleading pattern in a page, with probabilities
> like this:
>
> 0$ 1/256 * 2/256
> .0$ 254/256 * 1/256 * 2/256
> ..0$ (254/256)^2 * 1/256 * 2/256
> .{3}0$ (254/256)^3 * 1/256 * 2/256
>
> .{4094}0$ (254/256)^4094 * 1/256 * 2/256
>
> That sums up to ca. 1/256 (did that numerically). Does that make
> sense?

Yes, thanks.  I think the number would be different for "^0*$" (the
above is for "0$"), which moves it down to ~1/30000, but as I said,
allowing an additional false success rate is unnecessary (even if it
is minuscule enough to be acceptable), so let's take the 64*64 patch.

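
For anybody who wants to double-check the "ca. 1/256" figure quoted
above, here is a minimal sketch that adds up the listed terms, taking
the per-term factors exactly as quoted and assuming uniformly random,
independent bytes.  The function name and its parameter are made up
for illustration only.

```python
import math


def false_match_probability(max_gap: int = 4094) -> float:
    """Sum (254/256)^k * 1/256 * 2/256 for k = 0..max_gap."""
    other = 254 / 256               # factor quoted for each "." position
    tail = (1 / 256) * (2 / 256)    # factors quoted for the trailing "0$"
    # fsum keeps the rounding error of the 4095-term sum negligible.
    return math.fsum(other ** k * tail for k in range(max_gap + 1))


if __name__ == "__main__":
    p = false_match_probability()
    print(f"sum = {p:.6g}, 1/256 = {1 / 256:.6g}")
```

This is a geometric series, so the sum is (1/256) * (1 - (254/256)^4095),
which is 1/256 for all practical purposes.
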
>> So we are saying that we accept ~1/100 false success rate, but
>> additional ~1/30000 is unacceptable.
>>
>> I do not know if I buy that argument, but I do think that additional
>> false success rate, even if it is minuscule, is unnecessary.  So as
>> long as everybody's regexp library is happy with "^0{64}{64}$",
>> let's use that.
>
> The parentheses are necessary ("^(0{64}){64}$"), at least on OpenBSD.

Sorry, what I wrote was merely a typo; the one from you I applied
did have the parens so we are good.
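
To illustrate what the parenthesized pattern accepts, here is a small
sketch using Python's re module.  Python's regex dialect is not the
POSIX regcomp() that the OpenBSD remark is about, so this only shows
the meaning of the pattern (exactly 64*64 = 4096 zeros), not how any
particular regexp library behaves.  The helper name is hypothetical.

```python
import re

# 64 repetitions of a group of 64 zeros, i.e. exactly 4096 zeros.
ZERO_PAGE = re.compile(r"^(0{64}){64}$")


def is_zero_page(text: str) -> bool:
    """Return True only if text consists of exactly 4096 '0' characters."""
    return ZERO_PAGE.match(text) is not None


if __name__ == "__main__":
    print(is_zero_page("0" * 4096))        # exactly one page of zeros
    print(is_zero_page("0" * 4095 + "1"))  # last character differs
```
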
Thanks.