On 09/24, David Bennett wrote:
> It occurred to me that a sender that is paying their way into my inbox
> is almost certainly sending me junk mail. A little research in my
> inbox and it turns out to be right on the money. All stuff that I
> didn't want.

I'm very curious what exactly your statistics looked like. I'll point
you to the SpamAssassin Rule QA stats that are publicly available:

> # commercial buy-in whitelists (most likely junk)
> score RCVD_IN_BSP_TRUSTED 0.500
> score RCVD_IN_BSP_OTHER 0.500
> score RCVD_IN_BONDEDSENDER 0.500
> score HABEAS_ACCREDITED_COI 0 0.5 0 0.5
> score HABEAS_ACCREDITED_SOI 0 0.25 0 0.25
> score HABEAS_CHECKED 0 0.1 0 0.1

I don't see any of the above in the current SpamAssassin rules. What
version of SpamAssassin are you running? Anything before 3.3.0 is very
much not recommended.

Ah yes, all but RCVD_IN_BONDEDSENDER were replaced with
RCVD_IN_RP_CERTIFIED in version 3.3.0:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6247

And it looks like RCVD_IN_BONDEDSENDER was replaced by RCVD_IN_BSP_OTHER
and RCVD_IN_BSP_TRUSTED some time over four years ago:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5476

I'm guessing you're not actually getting hits on any of these six, and
just added them based on an article that hasn't been updated in four
years?

> score RCVD_IN_IADB_VOUCHED 0 0.2 0 0.2
> score RCVD_IN_IADB_DOPTIN 0 0.4 0 0.4
> score RCVD_IN_IADB_ML_DOPTIN 0 0.6 0 0.6

http://ruleqa.spamassassin.org/?daterev=20110924-r1175130-n&rule=%2FRCVD_IN_IADB

MSECS  SPAM%   HAM%    S/O    RANK  SCORE  NAME WHO/AGE
    0  0       0.0117  0.000  0.46  0.00   RCVD_IN_IADB_VOUCHED
    0  0       0.7806  0.000  0.66  0.00   RCVD_IN_IADB_DOPTIN
    0  0       0       0.500  0.45  0.00   RCVD_IN_IADB_ML_DOPTIN

These hit ZERO out of 362,124 spams. They also hit a pretty
insignificant amount of ham (non-spam).

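If you want to sanity-check the S/O column yourself, here is a rough sketch. It assumes S/O is spam% / (spam% + ham%), falling back to 0.5 when a rule hit nothing at all. That formula is my assumption, not something taken from the Rule QA code, but it reproduces the values in the IADB table. The names IADB_ROWS, s_over_o and so_table are made up for illustration.

```python
# Rows copied from the IADB Rule QA table: (rule name, SPAM%, HAM%).
IADB_ROWS = [
    ("RCVD_IN_IADB_VOUCHED", 0.0, 0.0117),
    ("RCVD_IN_IADB_DOPTIN", 0.0, 0.7806),
    ("RCVD_IN_IADB_ML_DOPTIN", 0.0, 0.0),
]


def s_over_o(spam_pct, ham_pct):
    """Assumed S/O formula: the share of a rule's hits that are spam.

    Uses percentages rather than raw counts, so the result does not
    depend on how much spam vs. ham is in the corpus. Returns 0.5
    (no evidence either way) when the rule hit nothing.
    """
    total = spam_pct + ham_pct
    if total == 0:
        return 0.5
    return spam_pct / total


def so_table(rows):
    """Map each rule name to its S/O value, rounded to three places."""
    return {name: round(s_over_o(spam, ham), 3) for name, spam, ham in rows}


if __name__ == "__main__":
    for name, so in so_table(IADB_ROWS).items():
        print(f"{so:.3f}  {name}")
```

An S/O near 0 means nearly everything the rule hits is ham, and an S/O near 1 means nearly everything it hits is spam. A positive score only makes sense for a rule at the high end.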
> score RCVD_IN_DNSWL_LOW 0 0.1 0 0.1
> score RCVD_IN_DNSWL_MED 0 0.4 0 0.4
> score RCVD_IN_DNSWL_HI 0 0.8 0 0.8

http://ruleqa.spamassassin.org/?daterev=20110924-r1175130-n&rule=%2FDNSWL

MSECS  SPAM%   HAM%     S/O    RANK  SCORE  NAME WHO/AGE
    0  0.0003   1.8893  0.000  0.75  0.00   RCVD_IN_DNSWL_HI
    0  0.0224  25.6371  0.001  0.86  0.00   RCVD_IN_DNSWL_MED
    0  0.0376  12.0356  0.003  0.79  0.00   RCVD_IN_DNSWL_LOW
    0  0.2090  21.8867  0.009  0.66  0.00   RCVD_IN_DNSWL_NONE

25.6% of ham hits RCVD_IN_DNSWL_MED. So you're adding a score of 0.4 to
a quarter of your ham, when that rule is only hitting 0.02% of spam (81
out of 362,124 spams). And that's just one of the three dnswl rules
you're scoring as bad.

I have pretty graphs of dnswl stats over time here:
http://www.chaosreigns.com/dnswl/
(Chrome renders that badly and Firefox renders it well; the
non-standardization pains me.) The two graphs at the bottom are spam vs.
ham numbers in the mass-check corpora, not specific to dnswl.

I assure you, if there were a test that was causing spam to get through,
and that wasn't still worth running on the grounds that a vastly
overwhelming majority of the emails it hit were ham (which reduces false
positives, and that matters more than missing a few spams), SpamAssassin
developers would be very interested to hear about it, and would remove
it. If you have that kind of information, please do provide it.

-- 
"If you are not paranoid... you may not be paying attention."
- j...@creative-net.net, on an IDPA mailing list
http://www.ChaosReigns.com