On 8 Dec 2003, Scott A Crosby wrote:

> On Mon, 08 Dec 2003 16:43:15 -0500, Matt Kettler <[EMAIL PROTECTED]> writes:
>
> > Or *, to catch more than one obfuscating character..
> >
> > ie: V...i..a.gr..a
> >
> > As I suggested in my email, there's lots of combinations that spammers
> > can do to avoid the original rule. There's also lots of ways to
> > construct the rule to get a broader hit-base, at the expense of
> > greater processing time.
>
> In theory, this isn't that much additional matching time, especially
> with an automata. In practice though, these sorts of rules will kill
> performance because Perl cannot apply the literal optimization,
> especially if they're applied widely. (There's more than just Vxxxxx
> -- most of the phrase rules need this sort of treatment.)
>
> Scott
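
To make the variants concrete, here is a rough sketch of the three gap styles under discussion. It is written in Python rather than Perl, so it only illustrates what each pattern matches and says nothing about how fast Perl's engine runs them. The helper name gap_pattern, the innocent sample sentence, and the bound of 3 are made up for illustration; only the obfuscated sample comes from the quoted message.

```python
import re

def gap_pattern(word, gap):
    """Join the letters of `word` with the regex fragment `gap` (hypothetical helper)."""
    return re.compile(gap.join(re.escape(ch) for ch in word), re.IGNORECASE)

WORD = "viagra"

single = gap_pattern(WORD, r".?")        # at most one filler character per gap
bounded = gap_pattern(WORD, r".{0,3}")   # up to three filler characters (arbitrary bound)
unbounded = gap_pattern(WORD, r".*")     # any amount of filler

obfuscated = "V...i..a.gr..a"                # sample from the quoted message
innocent = "Very nice idea, a great plan"    # made-up text that should not be flagged

for name, rx in (("single", single), ("bounded", bounded), ("unbounded", unbounded)):
    print(name, bool(rx.search(obfuscated)), bool(rx.search(innocent)))
```

Besides any cost in matching time, the unbounded form also matches the innocent sentence, because the letters v, i, a, g, r, a happen to occur in that order somewhere in it. The bounded form catches the obfuscated sample without that false hit, although a larger bound widens the net again.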

Scott,

If it's a bounded wildcard (".{0,6}") as opposed to an unbounded one (".*"), is it less of a hit (i.e. a reasonable thing to do)?

Are there any reasonably simple ways to do this without killing performance? (E.g. .? == OK, .* == BAD, .{0,n} == acceptable, for small values of 'n'.)

Are there any studies of the Perl matching engine's efficiency, or any rules of thumb?

Dave

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{