Bart Schaefer wrote:

BS> These look suspicious:
BS>
BS> score: ASCII_FORM_ENTRY 0.036 -> -1.660

Changed back to 0.5 -- as mentioned in my previous message, this is triggering on the sourceforge-appended footers on mailing list mails.

BS> score: BUGZILLA_BUG -2.000 -> 0.921

Moved to the right section of the scores file, and score reverted to -2.0.

BS> score: DATE_MISSING 0.248 -> -2.140

I've set the score for this one to 2.0 because I think the GA was working with bad input data on this rule (see prior message).

BS> score: EXCUSE_16 1.345 -> -0.721

I think this should be left as is. The big score change is probably because I've had a huge volume of correspondence with attorneys and accountants in the last few months. This is probably more representative of the email stream of the general population than was the previous corpus.

BS> score: FORGED_HOTMAIL_RCVD 0.530 -> -0.356

This was arbitrary -- no FORGED_HOTMAIL in the current corpus at all. I'll reset to 0.5.

BS> score: FROM_AND_TO_SAME 0.877 -> -2.071

I think this needs to be fixed to something like 2.0, but I'll leave it as is pending the resolution of the bugzilla bug related to being in one's own AWL.

BS> score: FROM_NAME_NO_SPACES 0.500 -> -0.114

Not a huge swing -- I think I'll believe the GA.

BS> score: GREEN_EXCUSE_1 3.116 -> -2.019

This one is a little odd. I've reset it to 3.116.

BS> score: INTL_EXEC_GUILD 0.781 -> -0.039

Also odd. Reset to 0.781.

BS> score: MONEY_BACK 1.489 -> -0.239
BS> score: MONEY_MAKING 2.490 -> -0.687

Reset to 1.500 and 2.500 respectively.

BS> score: MSGID_CHARS_WEIRD 1.500 -> -2.178

This is a bad rule. It needs to recognize that [] can appear in valid MSGIDs. Score left as is until the rule is fixed.

BS> score: NO_REAL_NAME 0.632 -> -1.068

I think this should be somewhere around +0.5; it's probably low because of sysadmin-email bias in the corpus. Resetting to 0.5.

BS> score: X_NOT_PRESENT 0.500 -> -1.920

I think this is probably right actually. There may be some element of sysadmin bias in the corpus, but I'll need someone to argue in favor of a +ve score for this rule before I reset it.
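
For anyone who wants to spot this kind of swing without eyeballing two score files, here is a minimal Python sketch. It is not part of SpamAssassin or the GA tooling: the names parse_scores and sign_flips and the min_swing threshold are made up for illustration, and it assumes both score sets are plain "score RULE_NAME value" lines. The sample values are taken from the list above.

```python
def parse_scores(text):
    """Parse 'score RULE_NAME value' lines into a dict; ignore everything else."""
    scores = {}
    for line in text.splitlines():
        # Drop trailing comments, then split on whitespace.
        parts = line.split("#", 1)[0].split()
        if len(parts) >= 3 and parts[0] == "score":
            scores[parts[1]] = float(parts[2])
    return scores


def sign_flips(old, new, min_swing=1.0):
    """Return (rule, old, new) for rules present in both sets whose score
    changed sign and moved by at least min_swing (a hypothetical threshold)."""
    flagged = []
    for rule in sorted(old.keys() & new.keys()):
        before, after = old[rule], new[rule]
        if before * after < 0 and abs(after - before) >= min_swing:
            flagged.append((rule, before, after))
    return flagged


# Sample data copied from the scores quoted in this message.
OLD = """
score ASCII_FORM_ENTRY 0.036
score DATE_MISSING 0.248
score FROM_NAME_NO_SPACES 0.500
score CTYPE_JUST_HTML 3.154
"""
NEW = """
score ASCII_FORM_ENTRY -1.660
score DATE_MISSING -2.140
score FROM_NAME_NO_SPACES -0.114
score CTYPE_JUST_HTML 1.665
"""

for rule, before, after in sign_flips(parse_scores(OLD), parse_scores(NEW)):
    print(f"score: {rule} {before:.3f} -> {after:.3f}")
```

With the default threshold this flags ASCII_FORM_ENTRY and DATE_MISSING but not FROM_NAME_NO_SPACES (sign flip, small swing) or CTYPE_JUST_HTML (large swing, no sign flip). A flagged rule is only a candidate for a manual look, not proof the GA got it wrong.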

BS> (How does BUGZILLA_BUG keep creeping back into the GA?)

Don't know -- probably cos I failed to move it into the "do not evolve these" section. I've made that move now so it should be OK in the future.

BS> score: ASKS_BILLING_ADDRESS 2.627 -> -0.152
BS> score: BE_AMAZED -0.260 -> 4.202
BS> score: CTYPE_JUST_HTML 3.154 -> 1.665
BS> score: LINES_OF_YELLING 0.453 -> -0.036
BS> score: LINES_OF_YELLING_3 -1.518 -> 0.478
BS> score: MAILTO_TO_REMOVE 1.341 -> -1.669
BS> score: MAILTO_WITH_SUBJ -0.310 -> 1.900
BS> score: MIME_NULL_BLOCK 0.157 -> -0.975
BS> score: SLIGHTLY_UNSAFE_JAVASCRIPT -0.794 -> 0.693
BS> score: SUBJ_ALL_CAPS 1.933 -> -0.054
BS> score: SUPERLONG_LINE -0.374 -> 0.384
BS> score: TO_BE_REMOVED_REPLY -2.150 -> 3.985
BS> score: TO_UNSUB_REPLY -1.996 -> 3.366
BS> score: TRACKER_ID -4.215 -> 4.332
BS> score: X_ESMTP 1.000 -> -1.662

I think many of these rules changed between 2.2 and 2.3, so the scores have changed to reflect the more accurately-drafted rules.

BS> How did these get exactly 1.0? Not represented in the corpus at all?
BS>
BS> score: FORGED_RCVD_TRAIL absent -> 1.000
BS> score: FROM_ADDRESS_EQ_REAL absent -> 1.000
BS> score: TO_ADDRESS_EQ_REAL absent -> 1.000

Yes, none of these are in the corpus. The RCVD_TRAIL was added late, after everyone had already run mass-check. The others possibly too.

BS> Amusing anecdote in case you get this far: I recently had to whitelist
BS> several friends because they were discussing the minutes of the local
BS> school board meeting. The budget numbers triggered the Nigerian scam
BS> rules and the sex-ed discussion set off the PORN rules. There's a case
BS> where rule intersection analysis might have been helpful -- there's
BS> probably not much Nigerian porn priced at millions of dollars.

You'd be surprised.
It depends on whether you're buying in bulk ;)

C

_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk