On Sun, 2002-03-03 at 21:16, Michael Moncur wrote:
> NEGATIVE SCORES that weren't intended to be:
> (probably by now most of these are just bad rules and should be set to zero)

Setting these to 0 without introducing new nonspam-identifying rules to
replace them will greatly (very greatly) increase your false positive
rate.  Be warned.  These might not have been intended to be nonspam
identifiers, but they do end up working that way somewhat effectively.
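
To make that concrete, here is a minimal sketch in Python.  The rule
names, the scores, and the 5.0 threshold are all made up for illustration
and are not taken from the real score set; the point is only that a
message's total is a sum, so zeroing a negative score can tip a
legitimate message over the line.

```python
# Hypothetical rule names and scores, for illustration only.
SCORES = {
    "SOME_SPAM_RULE_A": 3.2,
    "SOME_SPAM_RULE_B": 2.4,
    "ACCIDENTAL_NONSPAM_RULE": -2.0,  # negative, though not designed as a nonspam rule
}
THRESHOLD = 5.0  # assumed spam threshold


def total_score(hits, scores):
    """Sum the scores of every rule the message hit."""
    return sum(scores.get(rule, 0.0) for rule in hits)


def is_spam(hits, scores, threshold=THRESHOLD):
    """A message is flagged when its total reaches the threshold."""
    return total_score(hits, scores) >= threshold


# A legitimate message that happens to trip two spam rules and the
# accidentally negative one.
ham_hits = ["SOME_SPAM_RULE_A", "SOME_SPAM_RULE_B", "ACCIDENTAL_NONSPAM_RULE"]

# Same score set, with the negative rule set to zero.
zeroed = dict(SCORES, ACCIDENTAL_NONSPAM_RULE=0.0)

print(is_spam(ham_hits, SCORES))  # total is about 3.6, below the threshold
print(is_spam(ham_hits, zeroed))  # total is about 5.6, now flagged
```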

> NEGATIVE SCORES that are perhaps a bit too negative:
> 
> score EGP_HTML_BANNER                -6.039
> score IN_REP_TO                      -5.029
> score PGP_SIGNATURE                  -5.054
> 
> (I'm sure these were scored correctly, but any of them would be very easy for a
> spammer to add to a message to defeat SpamAssassin.)

Yes, I was a little hesitant about those.  As you say, it wouldn't be
terribly hard for a spammer to artificially introduce those into mail. 
I think for 2.11 we can leave them like that, though, and monitor what
spammers do in response.  Keep your eyes open, folks!

> My only other concern is that a few scores might be a bit high - for example,
> CTYPE_JUST_HTML is at 4.459. This works for me, but I thought some people were
> having false positives on this rule recently. It wouldn't hurt to wait and see
> what people report, though.

This one concerned me too.  A couple of other rules also seem to have
had their scores migrate over time.  Without a more sophisticated
mass-check system, one that logs the "sent date" of each message and
tries to project a curve (which would start to get very mathematically
complicated quite fast in the GA), it seems like it'd be somewhat hard
to deal with this problem well.  For now, I think we should just be
aware of the problem, and monitor for false positives.
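
As a rough idea of what "logging the sent date" could buy us, short of
any curve fitting, here is a sketch in Python.  It assumes a
hypothetical record format of (sent date, is_spam, set of rules hit);
this is not what mass-check produces today, and the sample records are
invented.  It just buckets nonspam messages by month and reports how
often a given rule fires on them, which is enough to see a rule
drifting.

```python
from collections import defaultdict
from datetime import date


def ham_hit_rate_by_month(records, rule):
    """Fraction of nonspam messages hitting `rule`, per (year, month).

    `records` is an iterable of (sent_date, is_spam, rules_hit) tuples;
    this layout is an assumption made for the sketch.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for sent, is_spam, rules_hit in records:
        if is_spam:
            continue  # only nonspam matters for false positives
        key = (sent.year, sent.month)
        totals[key] += 1
        if rule in rules_hit:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in sorted(totals)}


# Invented sample data, for illustration only.
sample = [
    (date(2002, 1, 5), False, set()),
    (date(2002, 1, 20), False, set()),
    (date(2002, 2, 3), False, {"CTYPE_JUST_HTML"}),
    (date(2002, 2, 17), False, set()),
    (date(2002, 2, 18), True, {"CTYPE_JUST_HTML"}),
]

for month, rate in ham_hit_rate_by_month(sample, "CTYPE_JUST_HTML").items():
    print(month, rate)
```

A rising rate from one month to the next would be the cue to look at
that rule's score again.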

C

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
