On 08/03/2011 23:49, Erik de Castro Lopo wrote:
> Wietse Venema wrote:
> 
>> If you must match a very large number of patterns, you need an
>> implementation that transforms N patterns into one deterministic
>> automaton. This can match 1 pattern in the same time as N patterns.
>> Once the automaton is built (which takes some time) it is blindingly
>> fast. An example of such an implementation is flex.
> 
> Is there a limit on the pattern length in the pcre tables?
> 
> If not, it would be possible to convert this (3 only, but could be
> hundreds or even thousands):
> 
>    /^([0-9]{1,3}\.){4}\.dsl\.dynamic\.eranet\.pl$/
>    /^([0-9]{1,3}\.){4}\.dynamic\.snap\.net\.nz$/
>    /^([0-9]{1,3}\.){4}\.nat\.umts\.dynamic\.eranet\.pl$/
> 
> to this:
> 
>    /^([0-9]{1,3}\.){4}\.(dsl\.dynamic\.eranet\.pl|dynamic\.snap\.net\.nz|nat\.umts\.dynamic\.eranet\.pl)$/
> 
> and that should reject "1.1.1.1.not-found" in 1/3 the time of the
> three original regexes while also matching quicker than the original.


Your speculation is wrong: /(joe|foo|bar)/ isn't 3 times faster than
three individual tests. But above all, "premature optimisation is the
root of all evil". One should not convert readable patterns into
unmaintainable hieroglyphs without measuring the real benefit.


> 
> Obviously, a conversion from the first three to the optimised version
> has to be done mechanistically to avoid errors.
> 

If it is to be done at all, it should be done inside the implementation.
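
For reference, the mechanical conversion described in the quoted text
could be sketched as follows. This is only an illustration in Python,
not anything Postfix provides: the helper name combine() and its prefix
argument are hypothetical, and it assumes every pattern starts with the
same literal prefix and ends with an unescaped "$".

```python
import re


def combine(patterns, prefix):
    """Merge patterns sharing `prefix` and a trailing `$` into one alternation.

    Hypothetical helper: it does plain string surgery and refuses any
    pattern that does not have the expected shape.
    """
    tails = []
    for pattern in patterns:
        if not (pattern.startswith(prefix) and pattern.endswith("$")):
            raise ValueError(f"unexpected pattern shape: {pattern!r}")
        # Keep only the part between the shared prefix and the final "$".
        tails.append(pattern[len(prefix):-1])
    return prefix + "(?:" + "|".join(tails) + ")$"


PREFIX = r"^([0-9]{1,3}\.){4}\."
ORIGINALS = [
    PREFIX + r"dsl\.dynamic\.eranet\.pl$",
    PREFIX + r"dynamic\.snap\.net\.nz$",
    PREFIX + r"nat\.umts\.dynamic\.eranet\.pl$",
]

if __name__ == "__main__":
    merged = combine(ORIGINALS, PREFIX)
    re.compile(merged)  # raises re.error if the result is not a valid regex
    print(merged)
```

Generating the merged pattern from the readable list at build time keeps
the readable list as the thing humans maintain.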

> Cheers,
> Erik
