Good evening, Justin,

On Mon, 17 Nov 2003, Justin Mason wrote:

> >>> Are there ways to improve the performance of the checks? I ask
> >>> because these URI rules are tripping on about 50-60% of my current
> >>> spam - much more than the corresponding source domain blacklist rules.
>
> Quick speed tips:
>
> .* = slow
> lookaheads or lookbehinds = very slow

	Neither is used - *phew*!

> anchoring with \b = fast

	OK, cool. As I'm doing full domains, I'll change:

uri WLS_URI_1 /0-go.org/i

	to

uri WLS_URI_1 /\b0-go.org\b/i

	in the next version.

	Is there any way I could get SA to extract _just_ the host portion
of the URI, unescape it, and lowercase it? Then I could test my domains
just against that, rather than the whole URI. It would also mean I could
remove the /i case insensitive search, which I'm sure isn't helping speed
at all. :-) Something like:

host WLS_URI_1 /\b0-go.org\b/

	Oh, shoot. Yahoo redirectors would screw that up. *sigh* Perhaps
we just put in rules for the known redirectors.

> anchoring with ^, $ = faster

	Tough to do in this case, although I know you were answering the
general question of how to make regexes faster.

> >Possibility 2: bound the rules. I noted that the URI for 16.com matched
> >significant ham. Test for /\bdomain/ and maybe it'll run a trifle
> >faster.
>
> yes. If you can bound at the start of the URL it'll probably be
> faster still...

	As a general rule, I'm testing against domains; it's too easy for
spammers to use random hostnames, more often than they do already. If I
try to bound at the start, won't I just end up with:

uri WLS_URI_1 /^http:.*\b0-go.org\b/i

	which puts us right back at the .* problem again?

	Cheers,
	- Bill

---------------------------------------------------------------------------
	"Give a man a fish and you feed him for a day; give him a
freshly-charged electric eel and chances are he won't bother you for
anything ever again."
(Courtesy of Thomas Harris <[EMAIL PROTECTED]>)
--------------------------------------------------------------------------
William Stearns ([EMAIL PROTECTED]).
Mason, Buildkernel, freedups, p0f, rsync-backup, ssh-keyinstall,
dns-check, more at: http://www.stearns.org
Linux articles at: http://www.opensourcedigest.com
--------------------------------------------------------------------------

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
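
	For illustration only, the "extract just the host, unescape it,
lowercase it, then test the domain" idea from the message above might look
roughly like the Python sketch below. This is not SpamAssassin code and not
an existing SA rule type; the function names and the one-entry BLACKLIST
are made up for the example. It compares the host against each domain with
plain string tests, so no .* and no /i are needed.

```python
from urllib.parse import unquote, urlsplit

# Hypothetical stand-in for the real list of blacklisted domains.
BLACKLIST = ("0-go.org",)


def extract_host(uri: str) -> str:
    """Return the unescaped, lowercased host part of a URI ("" if none)."""
    host = urlsplit(uri).hostname or ""
    # Unescape %xx sequences first, then lowercase the result.
    return unquote(host).lower()


def host_in_domain(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def uri_hits_blacklist(uri: str) -> bool:
    """Check only the host of the URI against the blacklisted domains."""
    host = extract_host(uri)
    return any(host_in_domain(host, domain) for domain in BLACKLIST)
```

	With this, uri_hits_blacklist("HTTP://WWW.0-GO.ORG/page") is true,
while a URI that only mentions the domain in its path is not matched. As
noted in the message, a redirector URI such as
http://rd.yahoo.com/*http://www.0-go.org/ would still slip through,
because its host is the redirector; those would need separate handling.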