Craig R Hughes writes:

> So it's just because the GA could get away with setting it to 0.921
> -- in practice it's a clear sign of nonspam, and we should just fix
> it at -2.0, which I've done on both branches now.

Okay. In HEAD, I made the rule less apt to be abused, which is just as
well since we're hard-coding the score negatively.

> [craig@belphegore masses]$ fgrep FROM_AND_TO_SAME freqs
>  3574  2877   697 FROM_AND_TO_SAME
>
> So it's not a bad rule -- occurs 5 times as frequently in spam as nonspam.
>
> FROM_AND_TO_SAME is triggered in 0.2% of the false positives in the corpus
> FROM_AND_TO_SAME is triggered in 3.1% of the false negatives in the corpus
>
> which makes the score the GA calculated make perfect sense.

Okay, as you probably already noticed, I removed FROM_AND_TO_SAME
earlier this morning (bug 456). Let me reopen that bug and I'll see if
I can find a way to improve the rule before I add it back. For me, only
42% of the hits are spam, although I'll grant you that it's not part of
any false positives, so I'll add it back either way.

Dan
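
For anyone who wants to redo the arithmetic on the quoted freqs line, here is a rough Python sketch. It assumes the three columns are total hits, spam hits, and nonspam hits; that is only inferred from 2877 + 697 = 3574, not taken from any documentation. The function names are made up for illustration. Note that it works on raw hit counts, so it does not account for how many spam versus nonspam messages are in the corpus, and its ratio need not match a figure normalised by corpus size.

```python
def parse_freqs_line(line):
    """Split a line assumed to look like '<total> <spam> <nonspam> <RULE>'."""
    total, spam, nonspam, rule = line.split()
    return rule, int(total), int(spam), int(nonspam)


def hit_stats(line):
    """Return raw-count statistics for one rule (hypothetical helper)."""
    rule, total, spam, nonspam = parse_freqs_line(line)
    return {
        "rule": rule,
        # how many spam hits there are per nonspam hit (raw counts)
        "spam_to_nonspam": spam / nonspam,
        # fraction of all hits that landed on spam
        "spam_share": spam / total,
    }


stats = hit_stats("3574 2877 697 FROM_AND_TO_SAME")
print(stats["rule"])
print(round(stats["spam_to_nonspam"], 2))
print(round(stats["spam_share"], 3))
```

The spam share of hits is the number to compare against the 42% figure above, since both describe what fraction of a rule's hits are spam in a given corpus.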