I was checking the relative usefulness of the per-user Bayes databases for my users and came up with the following confusing information.
When I look at the overall stats, Bayes does pretty well:

RANK  RULE NAME  COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
------------------------------------------------------------
   6  BAYES_99   26754      4.19    44.49    67.00    3.06

But when I run it for only our domain (which is where all the manual training happens), it hits less ham, but less spam as well:

RANK  RULE NAME  COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
------------------------------------------------------------
   8  BAYES_99    4649      3.29    33.41    54.64    0.20

Just my personal email address (which is trained aggressively) gets very few ham hits (partly because I lowered my threshold to 4.0), but it also catches less spam than the overall numbers:

RANK  RULE NAME  COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
------------------------------------------------------------
   5  BAYES_99    1643      3.08    27.05    65.72    0.08

And when I modify sa-stats to exclude our domain, I find that our customers (whose databases are trained exclusively by autolearn) seem to do better than we do:

RANK  RULE NAME  COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
------------------------------------------------------------
   6  BAYES_99   22105      4.44    47.83    70.35    4.11

Based on these results, it almost seems like the more training Bayes gets, the worse it does! Are these anomalies just an artifact of sa-stats relying on SA itself to judge what is ham and what is spam? Can these numbers be trusted at all if my users don't reliably report false negatives and false positives?

-- Bowie