Hello Larry,

Tuesday, November 25, 2003, 12:34:57 PM, you wrote:

LG> Attached is a custom rule file. It has been working rather well and
LG> I will be increasing the score from 0.5 to 1.0. The cf file also has
LG> some rules looking for words obfuscated by pipes. They have been
LG> working well also.

FYI, My masscheck results with your rules (run against my corpus of
58,857 emails). Final number on each line is what I would initially
score them based on these hits (per my algorithm posted at
http://www.exit0.us/index.php/RM_RuleScoring -- most sites should
probably score these lower, and I would probably want to do a 2-pass
or 3-pass GA on these to refine the scores myself).

MY_RBDY_PDS_1P3 -- 375s / 22h -- 1.163
MY_RBDY_PDS_1P4 -- 365s / 5h -- 1.608
MY_RBDY_PDS_1P5 -- 210s / 3h -- 1.700
MY_RBDY_PDS_1P6 -- 165s / 2h -- 0.550
MY_RBDY_PDS_1P7 -- 88s / 0h -- 1.880
MY_RBDY_PDS_1P8 -- 121s / 4h -- 1.302
MY_RBDY_PDS_2P2 -- 168s / 14h -- 1.112
MY_RBDY_PDS_2P3 -- 105s / 45h -- 0.228
MY_RBDY_PDS_2P4 -- 311s / 14h -- 1.207
MY_RBDY_PDS_2P5 -- 56s / 7h -- 0.700
MY_RBDY_PDS_2P6 -- 161s / 8h -- 1.179
MY_RBDY_PDS_2P7 -- 89s / 5h -- 1.148
MY_RBDY_PDS_2P8 -- 4s / 5h -- 0.067 or -0.100
MY_RBDY_PDS_3P1 -- 200s / 15h -- 1.125
MY_RBDY_PDS_3P2 -- 173s / 25h -- 6.654
MY_RBDY_PDS_3P3 -- 179s / 58h -- 0.303
MY_RBDY_PDS_3P4 -- 74s / 15h -- 0.463
MY_RBDY_PDS_3P5 -- 195s / 12h -- 1.150
MY_RBDY_PDS_3P6 -- 43s / 5h -- 0.717
MY_RBDY_PDS_3P7 -- 3s / 5h -- 0.050 or -0.125
MY_RBDY_PDS_3P8 -- 42s / 49h -- 0.084 or -0.114
MY_RBDY_PDS_4P1 -- 285s / 32h -- 0.864
MY_RBDY_PDS_4P2 -- 417s / 21h -- 1.190
MY_RBDY_PDS_4P3 -- 259s / 82h -- 0.312
MY_RBDY_PDS_4P4 -- 160s / 26h -- 0.593
MY_RBDY_PDS_4P5 -- 56s / 17h -- 0.311
MY_RBDY_PDS_4P6 -- 7s / 0h -- 0.700
MY_RBDY_PDS_4P7 -- 3s / 12h -- 0.023 or -0.300
MY_RBDY_PDS_4P8 -- 2s / 0h -- 0.200
MY_RBDY_PDS_5P1 -- 84s / 21h -- 0.382
MY_RBDY_PDS_5P3 -- 99s / 464h -- 0.021 or -0.464
MY_RBDY_PDS_5P5 -- 81s / 12h -- 0.623
MY_RBDY_PDS_6P6 -- 99s / 464h -- 0.021 or -0.464
MY_HDR_PDS_1P5 -- 140s / 0h -- 2.400
MY_HDR_PDS_2P1 -- 244s / 3h -- 1.610
MY_HDR_PDS_2P4 -- 176s / 13h -- 1.126
MY_HDR_PDS_3P2 -- 308s / 9h -- 1.308
MY_HDR_PDS_3P3 -- 607s / 528h -- 0.115
MY_HDR_PDS_3P5 -- 108s / 0h -- 2.080
MY_HDR_PDS_3P8 -- 73s / 0h -- 1.730
MY_HDR_PDS_4P3 -- 481s / 519h -- 0.093 or -0.108
MY_HDR_PDS_4P4 -- 114s / 13h -- 0.814
MY_HDR_PDS_4P5 -- 82s / 0h -- 1.820
MY_HDR_PDS_5P1 -- 171s / 0h -- 2.710
MY_HDR_PDS_6P1 -- 159s / 0h -- 2.590
MY_HDR_PDS_6P2 -- 122s / 9h -- 1.122
MY_BDY_PIPE_S233S -- 17s / 0h -- 1.170
MY_BDY_PIPE_S23S -- 35s / 0h -- 1.350
MY_BDY_PIPE_S23C -- 17s / 0h -- 1.170
MY_BDY_PIPE_S24S -- 42s / 0h -- 1.420
MY_BDY_PIPE_S34P -- 0s / 0h -- 0.100
MY_HDR_PIPE_S233S -- 0s / 0h -- 0.100
MY_HDR_PIPE_S23S -- 0s / 0h -- 0.100
MY_HDR_PIPE_S23C -- 0s / 0h -- 0.100
MY_HDR_PIPE_S24S -- 0s / 0h -- 0.100
MY_HDR_PIPE_S34P -- 0s / 0h -- 0.100

Two ham scored 5.0:

. 5 file=../massham/ham.0307.5360234
    rules=MY_RBDY_PDS_2P3, MY_RBDY_PDS_2P4, MY_RBDY_PDS_3P1,
    MY_RBDY_PDS_3P2, MY_RBDY_PDS_3P3, MY_RBDY_PDS_3P4, MY_RBDY_PDS_4P3,
    MY_RBDY_PDS_4P5, MY_RBDY_PDS_5P3, MY_RBDY_PDS_6P6
. 5 file=../massham/ham.0307.5383420
    rules=MY_RBDY_PDS_2P3, MY_RBDY_PDS_2P4, MY_RBDY_PDS_3P1,
    MY_RBDY_PDS_3P2, MY_RBDY_PDS_3P3, MY_RBDY_PDS_3P4, MY_RBDY_PDS_4P3,
    MY_RBDY_PDS_4P5, MY_RBDY_PDS_5P3, MY_RBDY_PDS_6P6

Negative scores:

So far when I use rules like this I've generally been scoring them
positive, often with a minimum of 0.100, even when they hit more ham
than spam. My philosophy has been "This is often used/seen in spam. If
it's spam, there should be enough rules hit to flag it as spam. If it's
ham, the few rules hit wouldn't matter since they wouldn't reach my
spam threshold." So, note to spammers: you can't fake your way past my
system by using those "more ham than spam" combinations -- they won't
get you negative scores.

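For anyone who wants to pick out the "more ham than spam" rules from results in the format above, here is a minimal Python sketch. It only parses the hit counts and compares them; it is not the RM_RuleScoring algorithm, and the names `parse_line`, `more_ham_than_spam`, and `SAMPLE` are made up for illustration. The sample lines are a small subset copied from the results above.

```python
import re

# Matches lines such as "MY_RBDY_PDS_5P3 -- 99s / 464h -- 0.021 or -0.464".
# The trailing score text is matched but not interpreted.
LINE_RE = re.compile(r"^(\S+)\s+--\s+(\d+)s\s*/\s*(\d+)h\s+--\s+(.+)$")


def parse_line(line):
    """Return (rule, spam_hits, ham_hits) for one result line."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))


def more_ham_than_spam(lines):
    """Names of rules whose ham hits exceed their spam hits."""
    return [rule for rule, spam, ham in map(parse_line, lines) if ham > spam]


# Illustrative subset of the results posted above.
SAMPLE = [
    "MY_RBDY_PDS_1P3 -- 375s / 22h -- 1.163",
    "MY_RBDY_PDS_3P8 -- 42s / 49h -- 0.084 or -0.114",
    "MY_RBDY_PDS_5P3 -- 99s / 464h -- 0.021 or -0.464",
    "MY_HDR_PDS_4P3 -- 481s / 519h -- 0.093 or -0.108",
    "MY_HDR_PDS_5P1 -- 171s / 0h -- 2.710",
]

print(more_ham_than_spam(SAMPLE))
```

On this subset the flagged rules are exactly the ones listed with an "or" negative alternative.
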
However, for rules like

MY_RBDY_PDS_6P6 -- 99s / 464h -- 0.021 or -0.464

which hit several ham for each spam, I'm thinking it might be useful to
score them negatively in my system, as a way of avoiding FPs when using
rule sets like these.

How do others feel about this type of question?

Bob Menschel

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk