On 11/30/2011 03:59 AM, Martin Gregorie wrote:
> On Tue, 2011-11-29 at 14:22 -0800, Adam Katz wrote:
>> You might want to consider Regexp::Assemble for your tool, though
>> that would require using perl. This would cause your man page's
>> example rule to result in something like this:
>> 
>>        body     __AU0 /(?i-xsm:\balt[123]\b)/
>>
>> rather than your script's *much* slower:
>>
>>        body     __AU0 /\b(alt1|alt2|alt3)\b/i
>>
> Interesting idea. Currently my system's performance seems 'adequate',
> considering I'm running SA on an 866 MHz P3 box with 512 MB RAM:
>                 Min                     Avg      Max
> Scan times:     0.9 (   3401 bytes)     4.0    128.3 (  72858 bytes)
> Msg sizes:     2258 (    1.8 secs )   10474   507533 (    6.2 secs )
> Messages:      2032
> 
> What sort of speed-up would Regexp::Assemble provide? 
> How would that compare with compiling the portmanteau.cf file?

Great question.  I do not have an answer.

How much optimization does re2c provide?  I am under the impression all
it does is convert text-based PCREs to C/C++ code of some sort, which
fully(?) mimics the original regexp's logic, implying that optimization
before compilation matters a lot.
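
To make "optimization before compilation" concrete, here is a rough Python
sketch of the simplest factoring step: alternatives that share a prefix and
differ only in one final character collapse into a character class. This is
not Regexp::Assemble's code (that module is Perl and handles far more cases);
`factor_alternation` is a hypothetical helper named only for this illustration.

```python
import os
import re


def factor_alternation(words):
    """Return a regex fragment matching any of `words`.

    Hypothetical helper for illustration only. It handles one case:
    every word is the shared prefix plus exactly one extra character.
    Anything else falls back to a plain non-capturing alternation.
    """
    prefix = os.path.commonprefix(words)
    tails = [w[len(prefix):] for w in words]
    if tails and all(len(t) == 1 for t in tails):
        # Deduplicate and escape each tail character for use inside [...].
        chars = "".join(re.escape(c) for c in sorted(set(tails)))
        return re.escape(prefix) + "[" + chars + "]"
    return "(?:" + "|".join(re.escape(w) for w in words) + ")"


if __name__ == "__main__":
    print(factor_alternation(["alt1", "alt2", "alt3"]))
```

The fragment still needs the surrounding `\b` anchors and a case-insensitive
flag to behave like the rule quoted above.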

I popped into irc://freenode.net#regex to ask, but this is apparently
too archaic a question.  Maybe somebody will have an answer in time.  (I
am not motivated enough to create an impromptu benchmark suite myself.)
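
For anyone who does want a starting point, a minimal timing harness could look
something like the following. It is a Python sketch, so its numbers say
nothing about Perl's regex engine or about re2c-compiled rules; the sample
text and the helper names `same_matches` and `time_pattern` are assumptions
made up for this example.

```python
import re
import timeit

# The two forms of the example rule, both case-insensitive.
ALTERNATION = re.compile(r"\b(alt1|alt2|alt3)\b", re.IGNORECASE)
CHAR_CLASS = re.compile(r"\balt[123]\b", re.IGNORECASE)


def same_matches(text):
    """Check that both patterns find the same matches in `text`."""
    a = [m.group(0) for m in ALTERNATION.finditer(text)]
    b = [m.group(0) for m in CHAR_CLASS.finditer(text)]
    return a == b


def time_pattern(pattern, text, repeat=3, number=50):
    """Best-of-`repeat` wall time in seconds for `number` searches."""
    return min(timeit.repeat(lambda: pattern.search(text),
                             repeat=repeat, number=number))


if __name__ == "__main__":
    # Made-up sample: many near misses, one real hit at the very end.
    sample = ("lorem ipsum alternative salt1 " * 200) + "ALT2 end"
    assert same_matches(sample)
    print("alternation:", time_pattern(ALTERNATION, sample))
    print("char class: ", time_pattern(CHAR_CLASS, sample))
```

A meaningful comparison would have to run the real rules over a real mail
corpus under SpamAssassin itself, with and without compiled rules.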
