On 11/30/2011 03:59 AM, Martin Gregorie wrote:
> On Tue, 2011-11-29 at 14:22 -0800, Adam Katz wrote:
>> You might want to consider Regexp::Assemble for your tool, though
>> that would require using perl. This would cause your man page's
>> example rule to result in something like this:
>>
>>   body __AU0 /(?i-xsm:\balt[123]\b)/
>>
>> rather than your script's *much* slower:
>>
>>   body __AU0 /\b(alt1|alt2|alt3)\b/i
>>
> Interesting idea. Currently my system's performance seems 'adequate',
> considering I'm running SA on an 866 mHz P3 box with 512 MB RAM:
>
>              Min                   Avg     Max
> Scan times:  0.9 ( 3401 bytes)     4.0     128.3 ( 72858 bytes)
> Msg sizes:   2258 ( 1.8 secs )     10474   507533 ( 6.2 secs )
> Messages:    2032
>
> What sort of speed-up would Regexp::Assemble provide?
> How would that compare with compiling the portmanteau.cf file?

Great question. I do not have an answer. How much optimization does
re2c provide? I am under the impression that all it does is convert
text-based PCREs into C/C++ code of some sort that fully(?) mimics the
original regexp's logic. If that is right, optimizing the regexp before
compilation would matter a lot.

I popped into irc://freenode.net#regex to ask, but this is apparently
too archaic a question. Maybe somebody will have an answer in time.
(I am not motivated enough to create an impromptu benchmark suite
myself.)
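
For anyone who does want to measure it, a rough starting point might look like the sketch below. It is an assumption-laden illustration, not a real benchmark: it is written in Python, so it exercises Python's re engine rather than Perl's or re2c-compiled code, and the sample text is made up. It only checks that the two forms of the example rule match the same things and times them side by side. Numbers from it say nothing definite about SpamAssassin itself.

```python
import re
import timeit

# The two rule bodies from the quoted example, translated to Python's re
# syntax. (?i-xsm:...) in Perl means case-insensitive only, hence IGNORECASE.
ASSEMBLED = re.compile(r"\balt[123]\b", re.IGNORECASE)
ALTERNATION = re.compile(r"\b(alt1|alt2|alt3)\b", re.IGNORECASE)

# Hypothetical sample text; a meaningful test would use a real mail corpus.
SAMPLE = "lorem ipsum alt2 dolor sit amet ALT3 consectetur alt4 salt1 " * 200


def matches(pattern, text):
    """Return every matched substring, in order of appearance."""
    return [m.group(0) for m in pattern.finditer(text)]


def time_pattern(pattern, text, repeat=20):
    """Return the seconds taken to scan `text` `repeat` times."""
    return timeit.timeit(lambda: matches(pattern, text), number=repeat)


if __name__ == "__main__":
    # Both forms should find exactly the same matches before timing them.
    assert matches(ASSEMBLED, SAMPLE) == matches(ALTERNATION, SAMPLE)
    print("assembled:  ", time_pattern(ASSEMBLED, SAMPLE))
    print("alternation:", time_pattern(ALTERNATION, SAMPLE))
```

To say anything about the actual question, the same comparison would have to be repeated in Perl and against rules compiled with sa-compile.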