Yes, this would depend heavily on how much optimization / transform the RE
compiler does. After all, Leif's argument would seem to work against using
a Trie, but that is in fact faster because of pattern matching
re-organization.

On Tue, Jan 15, 2019 at 11:29 AM Walt Karas <wka...@oath.com.invalid> wrote:

> I don't know the details of the algorithm for generating a state
> machine to match a PCRE.  I was wondering if there is an optimization
> so that matching an alteration of subpatterns is faster than than
> matching each subpattern sequentially.
>
> On Tue, Jan 15, 2019 at 11:21 AM Leif Hedstrom <zw...@apache.org> wrote:
> >
> >
> >
> > > On Jan 15, 2019, at 10:09 AM, Walt Karas <wka...@oath.com.INVALID>
> wrote:
> > >
> > > When we remap we go through the list of regexs looking for a match.
> > > What if we took all the regexes, removed subpattern captures, and
> > > created one giant pattern with alternation and capture of the regex
> > > for each rule, like:
> > >
> > > (regex1)|(regex2)|...|(regexN)
> > >
> > > We could then just step through the array of subpattern capture
> > > offsets to find the matching rule.  Then do a second match against
> > > just the original rule regex to get any captures it contains.  Would
> > > this work, and would it be faster?
> >
> >
> > That could potentially be a lot slower, no? Imagine you have a remap
> today with
> >
> >         regex_map <regex1> ...
> >         regex_map <regex2> ...
> >         regex_map <regex3> ...
> >         regex_map <regex4> …
> >
> >
> > If you know what you are doing, and have some luck, you could order
> these such as the most common regex is matched early.  And first match
> wins, and therefore, you could stop evaluating regexes early. Even in an
> average case, you’d only evaluate half of the regexes, right?
> >
> > But if you combine all these into one regex, you essentially have to
> evaluate all regexes all the time, regardless of order and regardless if
> something has matched.
> >
> > — Leif
> >
> >
>


-- 
*Beware the fisherman who's casting out his line in to a dried up riverbed.*
*Oh don't try to tell him 'cause he won't believe. Throw some bread to the
ducks instead.*
*It's easier that way. *- Genesis : Duke : VI 25-28

Reply via email to