On Thu, Feb 09, 2006 at 03:24:58PM -0000, John Hall wrote:
> "Ronan" <[EMAIL PROTECTED]> wrote in message
> >
> > Anyone have any input on this? What would be the implications? Should it
> > just be a straight translation perl -> c , or are there other factors?
>
> Ronan,
>
> Why would using pcre be quicker? Perl's regex engine is written in C as
> well. Besides, there is more to SA than just matching regexes.

In my opinion the most important difference between 'grep-ing' with
pcre and with perl is the startup time. Starting and dynamically
linking a whole perl interpreter is a lot more work than just starting
a pcre pattern engine. So if you 'just grep for text' from a script,
pcre(grep) is your friend.

BUT if you need lots of dynamic libraries, use loadable modules, and
connect to networks, as spamassassin does, pcre simply has nothing
comparable to offer. And in the case of 'spamd' the startup phase runs
only once; after that it only forks children, so there should be no
large startup penalty.

The one thing you should not do is use 'dangerous/slow' perl patterns:

  - avoid ambiguities
  - avoid capturing brackets; use (?: ) when you only need grouping
  - limit match lengths by using .{min,max} instead of '.*'
  - write left-factored patterns that are easy to decide

As far as I remember, perl 'allows' a few more complicated (not to say
convoluted) cases than pcre does, but you had better not use them in
spamassassin patterns anyway.

Stucki

-- 
Christoph von Stuckrad      * * |nickname |<[EMAIL PROTECTED]>  \
Freie Universitaet Berlin   |/_*|'stucki' |Tel(days):+49 30 838-75 459|
Mathematik & Informatik EDV |\ *|if online|Tel(else):+49 30 77 39 6600|
Arnimallee 2-6/14195 Berlin * * |on IRCnet|Fax(alle):+49 30 838-75454/
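
P.S. To make the pattern advice concrete, here is a minimal sketch. It is
written in Python only as a stand-in: its `re` module uses the same
(?: ) and {min,max} syntax as perl and pcre. The sample text and both
patterns are made up for illustration and are not real spamassassin
rules, and the gap limit of 40 is an arbitrary assumption.

```python
import re

# Hypothetical sample inputs, not taken from any real rule set.
text = "Subject: buy cheap pills now, limited offer"
far = "cheap" + "x" * 100 + "offer"  # keywords 100 characters apart

# Capturing brackets plus an unbounded '.*': the engine has to remember
# both groups, and '.*' may run to the end of the line before backtracking.
loose = re.compile(r"(cheap|free).*(offer|deal)")

# Same idea with non-capturing groups and a bounded gap (40 is arbitrary).
tight = re.compile(r"(?:cheap|free).{0,40}(?:offer|deal)")

print("groups remembered:", loose.groups, "vs", tight.groups)
print("loose matches far-apart keywords:", loose.search(far) is not None)
print("tight matches far-apart keywords:", tight.search(far) is not None)
print("tight matches nearby keywords:", tight.search(text) is not None)
```

The bounded form also changes what matches: keywords further apart than
the limit are no longer found, so the limit has to be chosen to fit the
text you actually want to catch.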