At 9:31 PM -0400 08/11/2013, Alex wrote:
> Can you post this rule again so we can investigate?
# HTML comment gibberish
# Looks for a sequence of 100 or more "words" (alphanum + punct
# separated by whitespace) within an HTML comment
rawbody HTML_COMMENT_GIBBERISH /<!--\s*(?:[\w'"?!.:;-]+\s+){100,}\s*-->/im
describe HTML_COMMENT_GIBBERISH lots of spammy text in HTML comment
score HTML_COMMENT_GIBBERISH 0.001
regexpal says my rule matches the comment. SA doesn't agree.
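To make the regexpal check reproducible, here is a minimal Python sketch that runs the same pattern against a made-up HTML comment. The filler words and the make_comment helper are hypothetical and only there for illustration. The last check is an assumption about the matching environment, not something established in this thread: if an engine hands the body to the rule one line at a time, a comment that spans several lines can never satisfy the pattern, which might explain a disagreement between an online tester and SA.

```python
import re

# Pattern copied from the rule above; Perl's /im maps to re.I | re.M.
HTML_COMMENT_GIBBERISH = re.compile(
    r"""<!--\s*(?:[\w'"?!.:;-]+\s+){100,}\s*-->""",
    re.IGNORECASE | re.MULTILINE,
)


def make_comment(word_count, words_per_line=10):
    """Build a hypothetical multi-line HTML comment with word_count filler words."""
    words = [f"word{i}" for i in range(word_count)]
    lines = [
        " ".join(words[i:i + words_per_line])
        for i in range(0, word_count, words_per_line)
    ]
    return "<!--\n" + "\n".join(lines) + "\n-->"


long_comment = make_comment(120)   # above the 100-word threshold
short_comment = make_comment(50)   # below the threshold

# Matching against the whole text, as an online regex tester would.
matches_whole_long = bool(HTML_COMMENT_GIBBERISH.search(long_comment))
matches_whole_short = bool(HTML_COMMENT_GIBBERISH.search(short_comment))

# Matching one line at a time (assumed behaviour, for comparison only).
matches_any_line = any(
    HTML_COMMENT_GIBBERISH.search(line) for line in long_comment.splitlines()
)

print(matches_whole_long, matches_whole_short, matches_any_line)
```

If the line-by-line case is what happens in practice, the rule would need to be tested against whatever unit of text the engine actually passes to rawbody rules; the SA documentation for rawbody is the place to confirm that.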
> How do you find the SPAMMY_URI_PATTERNS rule is performing? It seems
> very prone to FPs.
It's performing quite well for me... I haven't seen any FPs on it.
The patterns are based on specific spam templates... one looks for
/outl and /outi URIs, the other is /land/ + /unsub/ + /report/ ...
these URIs have to occur in combination. You are correct that it has
the potential for FPs but I haven't seen any so far.
> Why is there no BAYES score?
I ran this test through the root account which does not have a Bayes
DB, so there's no Bayes score. There was a Bayes score on the
original email, which was BAYES_50 just like every other one of these
types of spams (no real text, just a spammy image which SA isn't
decoding).
> Are you using sqlgrey? If not, it's incredible and you should try it.
I have not implemented any sort of greylisting yet. I can't use
sqlgrey because I don't use postfix... my server runs sendmail. I'm
sure there are some good sendmail-compatible greylisters but I
haven't tried them yet... I'm a bit worried about legitimate email
getting bounced. I'm sure I'll get to it in due course, though...
Thanks.
--- Amir