Daniel Quinlan wrote:

DQ> Craig R Hughes <[EMAIL PROTECTED]> writes:
DQ>
DQ> >> score: BUGZILLA_BUG -2.000 -> 0.921
DQ>
DQ> > Moved to the right section of the scores file, and score reverted to -2.0
DQ>
DQ> But why is it positive?  Doesn't it mean there are good messages in
DQ> the spam corpus or the rule is not good enough?  (See my other message
DQ> on BUGZILLA_BUG.)

[craig@belphegore masses]$ fgrep BUGZI freqs
       298 0 298 BUGZILLA_BUG

So the positive score is just because the GA could get away with setting it to
0.921: the rule never hits spam in the corpus (0 of its 298 hits). In practice
it's a clear sign of nonspam, and we should just fix it at -2.0, which I've
done on both branches now.
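
For anyone reading the freqs output above, here is a minimal sketch of how a
line can be picked apart. It assumes the columns are total hits, spam hits,
nonspam hits, rule name; that layout is inferred from the two lines quoted in
this message (the first number equals the sum of the next two), not taken from
the script that writes the file. The names RuleFreq and parse_freqs_line are
made up for illustration.

```python
from typing import NamedTuple


class RuleFreq(NamedTuple):
    total: int
    spam: int
    nonspam: int
    rule: str


def parse_freqs_line(line: str) -> RuleFreq:
    # Assumed layout: "<total> <spam> <nonspam> <RULE_NAME>"
    total, spam, nonspam, rule = line.split()
    return RuleFreq(int(total), int(spam), int(nonspam), rule)


bug = parse_freqs_line("       298 0 298 BUGZILLA_BUG")
# A rule that only ever hits nonspam is a candidate for a fixed negative score.
print(bug.rule, "spam hits:", bug.spam, "nonspam hits:", bug.nonspam)
```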

DQ> >> score: FROM_AND_TO_SAME 0.877 -> -2.071
DQ>
DQ> > I think this needs to be fixed to something like 2.0, but I'll leave
DQ> > it as is pending the resolution of the bugzilla bug related to being
DQ> > in one's own AWL.
DQ>
DQ> I'm convinced that it's a bad rule.  Only 41 out of 125 false
DQ> positives in my good mail database of 6037 messages is me.  The rest
DQ> of the 125 are other good senders using the same Bcc: trick.

[craig@belphegore masses]$ fgrep FROM_AND_TO_SAME freqs
      3574 2877 697 FROM_AND_TO_SAME

So it's not a bad rule -- by raw hit counts it occurs about 4 times as often in
spam as in nonspam (2877 vs. 697).
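
A quick way to check the spam-to-nonspam ratio from the freqs line above is
sketched here. The function name and its optional corpus-size parameters are
hypothetical; the corpus sizes are not given in this message, so only the raw
hit ratio is computed. A per-message rate ratio would differ if the spam and
nonspam corpora are not the same size.

```python
from typing import Optional


def spam_to_nonspam_ratio(
    spam_hits: int,
    nonspam_hits: int,
    n_spam: Optional[int] = None,
    n_nonspam: Optional[int] = None,
) -> float:
    """Raw hit ratio, or a hit-rate ratio when both corpus sizes are given."""
    if nonspam_hits == 0:
        return float("inf")
    if n_spam and n_nonspam:
        # Normalize each count by the size of its corpus before dividing.
        return (spam_hits / n_spam) / (nonspam_hits / n_nonspam)
    return spam_hits / nonspam_hits


raw = spam_to_nonspam_ratio(2877, 697)
print(f"{raw:.2f}")  # -> 4.13
```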

FROM_AND_TO_SAME is triggered in 0.2% of the false positives in the corpus
FROM_AND_TO_SAME is triggered in 3.1% of the false negatives in the corpus

which is why the score the GA calculated makes perfect sense.


DQ> > Yes, none of any of these in the corpus.  The RCVD_TRAIL was added
DQ> > late, after everyone had already run mass-check.  The others
DQ> > possibly too.
DQ>
DQ> They're new in HEAD, that's all.

I was in the wrong directory -- I thought I was reading the 2_3_0 line, but I
was actually on HEAD.

C


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
