On Sat, 27 Apr 2013, Martin Gregorie wrote:

Question to JH: I can see that portmanteau rules on high volume sites
would benefit from the (?=x) optimisation, but so would a lot of rules
that use regexes containing alternations. So, is there any possibility
of slotting it into the SA rule compiler rather than me implementing it
as part of my portmanteau rule generator when doing the latter would
limit its use to a subset of rules that might benefit?

The internal compiler within SA itself, or the external tool that compiles rules?

The former essentially just generates Perl code for rule evaluation and reuses it. It doesn't do any analysis or rule combining that would let it take advantage of this.

I have no familiarity with the latter, so I couldn't comment.

It seems to me that on-the-fly generation of (?=x)(xa|xb|xc) or x(a|b|c) optimizations of alternations is a fairly non-trivial task given something other than alternations built from a sorted set of substrings or sub-REs - and in fact, now that I think about it, that sort of optimization is properly the job of an optimizer for the RE parser built into Perl. The programmer writing the REs shouldn't have to worry about doing something (fairly) obvious like that. Think of it the same way you think about the query plan optimizer in a database engine.
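
To illustrate just the easy case, here is a rough sketch of factoring a common literal prefix out of an alternation. It is written in Python rather than Perl purely for illustration, it is not anything SA does, and the function name factor_alternation is made up for this example. It assumes the alternatives are plain literal strings; anything containing sub-REs would need real parsing, which is where the non-trivial part starts.

```python
import os
import re

def factor_alternation(literals):
    """Hypothetical helper: build both optimized forms for literal alternatives.

    Returns (lookahead_form, factored_form). Assumes every alternative is a
    plain literal string, not a sub-RE.
    """
    # Longest common leading substring of all alternatives (may be empty).
    prefix = os.path.commonprefix(literals)
    escaped = [re.escape(s) for s in literals]
    tails = [re.escape(s[len(prefix):]) for s in literals]
    # (?=x)(xa|xb|xc) form: cheap lookahead check before trying alternatives.
    lookahead = "(?=%s)(?:%s)" % (re.escape(prefix), "|".join(escaped))
    # x(a|b|c) form: the shared prefix is matched once, then the tails.
    factored = "%s(?:%s)" % (re.escape(prefix), "|".join(tails))
    return lookahead, factored

lookahead, factored = factor_alternation(["viagra", "vicodin", "vitamin"])
print(lookahead)  # (?=vi)(?:viagra|vicodin|vitamin)
print(factored)   # vi(?:agra|codin|tamin)
```

Both forms accept the same strings as the plain alternation; whether either one is actually faster depends on what the RE engine already does internally.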

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  ...we talk about creating "millions of shovel-ready jobs" for a
  society that doesn't really encourage people to pick up a shovel.
                             -- Mike Rowe, testifying before Congress
-----------------------------------------------------------------------
 330 days since the first successful private support mission to ISS (SpaceX)
