Re: New spam rule for specific content

Amir 'CG' Caspi Sun, 11 Aug 2013 18:46:15 -0700

At 9:31 PM -0400 08/11/2013, Alex wrote:

Can you post this rule again so we can investigate?


# HTML comment gibberish

# Looks for a sequence of 100 or more "words" (alphanum + punct, separated by whitespace) within an HTML comment

rawbody HTML_COMMENT_GIBBERISH  /<!--\s*(?:[\w'"?!.:;-]+\s+){100,}\s*-->/im
describe HTML_COMMENT_GIBBERISH lots of spammy text in HTML comment
score HTML_COMMENT_GIBBERISH    0.001

regexpal says my rule matches the comment.  SA doesn't agree.
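
One way to narrow down a disagreement like this is to run the same pattern outside SA against a synthetic comment, once on the whole text and once line by line. The sketch below does that in Python. The sample comment and the helper names are made up for illustration, and the line-by-line feed is only a hypothesis about how the rawbody text might reach the regex, not something confirmed in this thread; check the rawbody section of the SA documentation before relying on it.

```python
import re

# Pattern copied from HTML_COMMENT_GIBBERISH; the /im modifiers map to re.I | re.M.
GIBBERISH = re.compile(r"""<!--\s*(?:[\w'"?!.:;-]+\s+){100,}\s*-->""", re.I | re.M)

def matches_whole(text):
    """True if the pattern matches somewhere in the full text."""
    return GIBBERISH.search(text) is not None

def matches_any_line(text):
    """True if the pattern matches inside any single line.

    Hypothetical: models a scanner that hands the regex one line at a time.
    """
    return any(GIBBERISH.search(line) for line in text.splitlines())

# Synthetic comment (not from a real spam): 120 filler words, 10 per line.
words = ["word%d" % i for i in range(120)]
lines = [" ".join(words[i:i + 10]) for i in range(0, 120, 10)]
sample = "<!--\n" + "\n".join(lines) + "\n-->"

print("whole text:", matches_whole(sample))
print("line by line:", matches_any_line(sample))
```

If the whole-text check succeeds while the line-by-line check fails, the regex itself is probably fine and the difference lies in how much of the message the pattern gets to see at once.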

How do you find the SPAMMY_URI_PATTERNS rule is performing? It seems
very prone to FPs.

It's performing quite well for me... I haven't seen any FPs on it. The patterns are based on specific spam templates... one looks for /outl and /outi URIs, the other is /land/ + /unsub/ + /report/ ... these URIs have to occur in combination. You are correct that it has the potential for FPs but I haven't seen any so far.
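
The "in combination" idea can be sketched roughly as follows. This is not the actual SPAMMY_URI_PATTERNS rule, which isn't shown in this message; the function name, the substring matching, and the example URLs are all assumptions made to illustrate why a single /unsub/ link on its own would not fire.

```python
from urllib.parse import urlparse

def spammy_uri_combo(uris):
    """Hypothetical check: fire only when a whole template's URIs co-occur."""
    paths = [urlparse(u).path for u in uris]

    def has(fragment):
        # True if any URI path in the message contains the fragment.
        return any(fragment in p for p in paths)

    # Template 1 (assumed reading): both /outl and /outi URIs are present.
    template_a = has("/outl") and has("/outi")
    # Template 2 (assumed reading): /land/, /unsub/ and /report/ are all present.
    template_b = has("/land/") and has("/unsub/") and has("/report/")
    return template_a or template_b

# A lone unsubscribe link does not trigger the check...
print(spammy_uri_combo(["http://example.com/unsub/me"]))
# ...but the full three-link template does.
print(spammy_uri_combo([
    "http://example.com/land/a",
    "http://example.com/unsub/b",
    "http://example.com/report/c",
]))
```

Requiring every piece of a template is what keeps the FP risk down: each fragment is common by itself, but the full set is much less likely to show up in legitimate mail.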

Why is there no BAYES score?

I ran this test through the root account, which does not have a Bayes DB, so there's no Bayes score. There was a Bayes score on the original email, which was Bayes50 just like every other one of these types of spams (no real text, just a spammy image which SA isn't decoding).

Are you using sqlgrey? If not, it's incredible and you should try it.

I have not implemented any sort of greylisting yet. I can't use sqlgrey because I don't use postfix... my server runs sendmail. I'm sure there are some good sendmail-compatible greylisters but I haven't tried them yet... I'm a bit worried about legitimate email getting bounced. I'm sure I'll get to it in due course, though...


Thanks.

                                                --- Amir
