On May 17, 2002 06:22 pm, Daniel Quinlan wrote:
>     FROM_AND_TO_SAME - I mail myself notes

Agreed.  I also sometimes send to myself when I have a BCC mailing.

>     VERY_SUSP_RECIPS and VERY_SUSP_CC_RECIPS - people use large
>       internal To and Cc all the time

This isn't just for internal stuff.  "Regular" users often get "Fwd: 
Friends"-type emails with tons and tons of addresses (often hotmail) in them.  
These are almost always tagged as spam, even with my "common MSN footers" and 
"Forwarded messages" negative-scoring tests.

>     X_PRIORITY_HIGH - I'm less certain about this one, I guess people
>       aren't shy about using this internally?

At the company I work for, the PHBs love sending email with the priority set 
high.  They absolutely love this feature.  <sigh>

> (3) Why are these negative?
>     MAILTO_WITH_SUBJ
>     HTML_WITH_BGCOLOR
>     SLIGHTLY_UNSAFE_JAVASCRIPT
>     OPPORTUNITY
>     ALL_CAPS_HEADER
>     [ and more ]

Probably because, after running the GA over the non-spam corpus, it was found 
that these were NOT good indicators of spam.  I can definitely see the first 
two being problematic tests for spam.

>     Seems like they've earned negative scores even though they are
>     clearly spam detectors.  Rules that aren't effective, but are not
>     intended to detect legitimate mail should be scored at 0.0, not
>     negatively.

I'm guessing that they had to be scored negatively because the GA marked other 
tests as very good indicators of spam, but the combination of those tests and 
the negatively scored ones was found in non-spam email.  :-)
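
To make that guess concrete, here is a rough sketch with made-up numbers.  The 
scores, the STRONG_RULE name and the 5.0 threshold are all assumptions for 
illustration, not values from the real score file.  It only shows how a 0.0 
score can leave a legitimate message over the threshold while a small negative 
score pulls it back under.

```python
# Hypothetical values for illustration only; not real SpamAssassin scores.
THRESHOLD = 5.0  # assumed spam threshold


def total_score(hits, scores):
    """Sum the scores of every rule that fired on a message."""
    return sum(scores[rule] for rule in hits)


def is_spam(hits, scores):
    """A message is tagged when its total reaches the threshold."""
    return total_score(hits, scores) >= THRESHOLD


# STRONG_RULE is a made-up name standing in for a test the GA scored highly.
zero_scored = {"STRONG_RULE": 5.2, "MAILTO_WITH_SUBJ": 0.0}
negative_scored = {"STRONG_RULE": 5.2, "MAILTO_WITH_SUBJ": -0.5}

ham_hits = ["STRONG_RULE", "MAILTO_WITH_SUBJ"]  # legitimate mail hitting both
spam_hits = ["STRONG_RULE"]                     # spam hitting only the strong test

print(is_spam(ham_hits, zero_scored))       # legitimate mail is tagged at 0.0
print(is_spam(ham_hits, negative_scored))   # negative score pulls it under
print(is_spam(spam_hits, negative_scored))  # spam without the weak test is still caught
```

If something like this is what happened, the negative score is compensating 
for an over-scored test rather than saying the rule itself indicates 
legitimate mail.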

Regards,
Andrew
