I get a significant amount of spam through mailing lists that I am legitimately subscribed to. It is either list administration email asking me whether I want to approve a held "email", or it is spam that makes it through to the list itself.
These messages either hit ALL_TRUSTED, because they come from mailing lists on my own networks, or are tagged with a clear untrusted-relays list. In other words, I have set up trusted_networks so that SA knows which networks I trust to be sending legitimate email (they are not spam originators). Spam obviously still gets through them, but it originates at hops before these networks.

If I understand things properly, because these networks are in my trusted_networks, the hops before them will be checked against RBLs, which makes the spam more detectable. For example, the Debian servers do send some spam to me, but the Received: headers in those emails are correct, so if the server's address is in trusted_networks, SA will look up in RBLs the address that Debian got the email from.

What I am unsure of is whether I am poisoning my bayes database by reporting the messages that make it through as spam. Should I just be deleting them instead? The legitimate tokens that will end up as collateral damage are going to be the list footers, the list administration messages, and potentially other pieces.

I'm hoping to identify why my bayes database is so bad (it thinks everything is BAYES_00 now). If this is the reason, I will want to change my training behavior.

thanks,
micah
