Excellent. I am in agreement. I've sent a raw list of all the URLs in the rules to Chris Santerre, with a promise that once I find some time I'll write up some Perl code to clean them up and form rules out of them.

Does anyone have any resource-optimization documentation for regexps in Perl?

Greg

On Thu, 2003-12-04 at 16:11, Scott A Crosby wrote:
> On Thu, 04 Dec 2003 11:43:30 -0800, Greg Webster <[EMAIL PROTECTED]> writes:
>
> > Seems like it would be much better to simplify and shorten these rules
> > with better regexp.
> >
> > Samples:
>
> > rawbody BigEvilList_22 /\b(?:agnitum\.com|ahamembership\.com|aicpa-eca\.org|aicpa\.org|aih01\.com|ai\.hitbox\.com|AIRMARCH\.COM|AIRSHADE\.COM|ajc\.com|akss\.org|albuminfo\.org|alertquotes\.com|alfy\.com)\b/i
> > describe BigEvilList_22 Generated BigEvilList_22
>
> If the rules look like this (abc|aef|agh), then you should get greater
> performance by factoring the 'a' out of the expression: a(bc|ef|gh).
> This means it can bail out fast if the string doesn't start
> with an 'a'. There might be an optimization in the re engine to
> autodetect this, but doing it manually won't hurt.
>
> Also, doing additional factoring may be a win:
>
> hotbox|hoturls|hotgyrls|hotlemons|hotstocks|honestmerchangs|happymerchants
>
> -->
>
> h(ot(box|urls|gyrls|lemons|stocks)|onestmerchangs|appymerchants)
>
> Factor out the 'h' so that it can do a prefix-reject quickly, and then
> factor out the 'ot' so that it won't check 'hox' against 'hotbox'
> .. 'hotstocks'.
>
>
> Scott

-- 
Greg Webster - [EMAIL PROTECTED]
In-Touch Software Corporation
Ph: (604)278-0515 - Fax: (604)608-3112

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
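
The prefix factoring Scott describes can be automated by loading the strings into a trie and emitting one alternation per branch point. What follows is a minimal sketch of that idea, not the code promised in this thread. It is written in Python rather than Perl, on the assumption that the constructs it emits (`\b`, `(?:...)`, `|`, `?`) behave the same in both engines. The names `build_trie`, `trie_to_regex` and `factored_pattern` are hypothetical, and the domain list is a short excerpt from the sample rule above. It only shows how to build the pattern; whether the result actually matches faster depends on the regex engine and should be measured.

```python
import re

def build_trie(words):
    """Insert each word into a nested-dict trie."""
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}  # the empty key marks "a word ends here"
    return trie

def trie_to_regex(node):
    """Turn a trie node into a regex fragment with shared prefixes factored out."""
    ends_here = "" in node
    branches = [re.escape(ch) + trie_to_regex(child)
                for ch, child in sorted(node.items()) if ch != ""]
    if not branches:
        return ""
    if len(branches) == 1 and not ends_here:
        return branches[0]  # a single continuation needs no group
    group = "(?:" + "|".join(branches) + ")"
    # If a shorter word ends at this node, the longer continuations are optional.
    return group + "?" if ends_here else group

def factored_pattern(words):
    """Lowercase and de-duplicate the words, then wrap the factored regex in \\b...\\b."""
    unique = sorted({w.lower() for w in words})
    return r"\b" + trie_to_regex(build_trie(unique)) + r"\b"

# Scott's example list, spelled as in his message.
scott_words = ["hotbox", "hoturls", "hotgyrls", "hotlemons", "hotstocks",
               "honestmerchangs", "happymerchants"]
print(factored_pattern(scott_words))

# A few domains taken from the BigEvilList_22 sample; matching is
# case-insensitive, like the /i flag on the original rule.
domains = ["agnitum.com", "ahamembership.com", "aicpa-eca.org",
           "aicpa.org", "AIRMARCH.COM"]
rule = re.compile(factored_pattern(domains), re.IGNORECASE)
print(bool(rule.search("mail from news@AICPA.org today")))
```

The sketch factors only common prefixes. Shared suffixes such as `.com` and `.org` are left repeated in each branch, which keeps the generator simple at the cost of a longer pattern.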