Kurt Fitzner wrote:
John D. Hardin wrote:
 > But if the stated purpose of the BL is "this domain does not have a
working postmaster address" then it's unreasonable to ask them to
exclude a domain that does not have a working postmaster address, no
matter how large or popular that domain is.

My concern is the score attached to those rules by SpamAssassin.  The
purpose of SpamAssassin is to detect spam with as few false positives as
possible.  Attaching a score of 3.2 to every outgoing mail from
yahoo.com is counterproductive.  I would even go so far as to claim
that those rules are adding more spam points to ham mail than any other
rule.

The purpose of SpamAssassin is not to punish domains without working
postmaster addresses.  It is not to act as RFC cops. It is to detect
spam.  Let's not lose sight of the goal because some BL has gone on
a crusade to police compliance with RFCs that have lost relevance.

As far as SpamAssassin is concerned, the rule is only to detect spam,
and if that is the case, then size and popularity of the domain do
matter - the ham to spam ratio from that domain matters, and the volume
of false positives definitely matters.  Note to all:  the rule is broken.


No. The size of the domain does not matter. The volume of the domain does not matter. The popularity of the domain does not matter.

What matters is whether, looking at the spam corpus vs the ham corpus, applying that score value to messages which come from/through a host listed in RFCI helps to differentiate spam from ham. The specific hosts, and their characteristics, don't matter in determining the value of _that_ rule. Nor should they.

The essential questions are: "did the message come from/through a host in that RBL?" and "given _all_ messages that come from _all_ hosts in that RBL, how accurate is that characteristic as a predictor of any random message being spam?" Notice, a specific host isn't part of either question.
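
To make those two questions concrete, here is a minimal sketch in Python. The Message type, its field names, and the rule_precision function are all made up for illustration, and this is not how SpamAssassin's score generation actually works; it only shows the shape of the question being asked. Note that relay_domain is stored on each message but never consulted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """One hand-classified corpus message (hypothetical structure)."""
    is_spam: bool      # label assigned when the corpus was built
    hit_rule: bool     # True if the message came from/through a listed host
    relay_domain: str  # kept only to show that it plays no part below


def rule_precision(corpus):
    """Fraction of rule hits that are spam, across ALL listed hosts.

    Returns 0.0 when the rule hit nothing, so callers never divide by zero.
    """
    hits = [m for m in corpus if m.hit_rule]
    if not hits:
        return 0.0
    spam_hits = sum(1 for m in hits if m.is_spam)
    return spam_hits / len(hits)
```

Under these assumptions, a large domain changes the result only through the spam and ham messages it contributes to the corpus, which is why adding counterexamples to the corpus is the way to move the number.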

You're right that the purpose of SpamAssassin is not to punish domains that violate RFCs. It is also not the purpose of SpamAssassin to reward or give exemptions to domains that are large/popular/etc. The purpose of SpamAssassin is to identify spam, and in doing so it develops rules and then weights those rules according to their accuracy against the corpus. That rule has a 3.2 value because the 3.2 value is accurate at differentiating spam vs ham in the corpus. Therefore, the score is appropriate.

If you're complaining that the rule isn't actually weighted correctly _across_all_messages_from_all_hosts_ (not just messages from your pet domain(s)), then see about giving more counter examples to the team that performs that part of the determination, so that they can be part of the corpus which sets the scores.
