> Actually, it didn't.  The assertion is that if someone else hadn't seen 
> this exact message first, then SA wouldn't have caught it.

No, the assertion is that if someone else hadn't seen prior abuse from
the sending host first (not this exact message), then SA wouldn't have
caught that particular message. That assertion happens to be true for
the blacklists, and true for BAYES as well since it would have had to
have seen headers (since the payload is vastly different) that look like
this sending host in the recent past and been told that it was SPAM.

> 
> The PBL (which isn't spamtrap fed, it's collected from ISP published 
> and/or contributed data) would have caught this based upon issues that 
> have nothing at all to do with this message, and most likely nothing at 
> all to do with this current round of spam.  It would be based upon the 
> host provider's policy that this host shouldn't send email to the internet.

Which means that at some time in the past, for whatever reason, that
particular IP address did something against someone's policy to end up
on that list. The important part being "in the past".

> Similarly, the SPAMCOP listing is most likely not related to _this_ 
> message.  It is more likely an ongoing abuse issue, so the fact that the 
> host fed a spamtrap at spamcop at some point in the past does not mean 
> that they were "lucky to catch this message".  The odds are that the 
> SPAMCOP listing has nothing to do with this message.

Spamcop automatically delists IP addresses over time; to be relisted,
someone/something has to report new abuse. If you happen to receive the
message before anyone has reported the new abuse, well, it won't be listed.
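
To make the timing argument concrete, here is a rough sketch of a list
that expires entries and needs a fresh report to relist. The class name,
the 24-hour window and the IP address are all made up for illustration;
this is not how Spamcop actually implements or times its delisting.

```python
class ExpiringBlocklist:
    """Toy blocklist: an IP stays listed only for ttl_seconds after its last report."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.last_report = {}  # ip -> time of the most recent abuse report

    def report(self, ip, now):
        # A new abuse report (re)lists the address.
        self.last_report[ip] = now

    def is_listed(self, ip, now):
        reported = self.last_report.get(ip)
        return reported is not None and now - reported < self.ttl


HOUR = 3600
bl = ExpiringBlocklist(ttl_seconds=24 * HOUR)  # hypothetical expiry window
ip = "192.0.2.10"                              # documentation-range address

bl.report(ip, now=0)
listed_soon = bl.is_listed(ip, now=1 * HOUR)             # still inside the window
listed_after_expiry = bl.is_listed(ip, now=48 * HOUR)    # expired, nobody re-reported yet
bl.report(ip, now=49 * HOUR)
relisted = bl.is_listed(ip, now=49 * HOUR + 60)          # listed again after a new report
```

A message arriving at hour 48 gets through; the same message arriving
after the hour-49 report does not.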

> I would make the same characterization of BAYES.  You don't have to see 
> a specific message in the past in order for BAYES to catch it. 
> Therefore, you're not depending upon "luckily not being the first person 
> to see a given message".

Explain how BAYES will have any matching tokens to work on if it's from a
fresh, never before seen by your system, zombie and there's no message
body other than the attachment? All you have to work with is headers
which you've never seen before and MIME boundaries which you've never
seen before.
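
As a rough illustration of the point, here is a simplified naive Bayes
combiner. It is a sketch only, not SA's actual BAYES code (which
tokenizes and combines probabilities differently), and the token strings
and function names are invented for the example. Tokens with no training
history contribute no evidence, so a message made up entirely of
never-seen tokens comes out neutral.

```python
import math


def train(counts, tokens, is_spam):
    """Record one training message's tokens as spam or ham."""
    for tok in set(tokens):
        spam, ham = counts.get(tok, (0, 0))
        counts[tok] = (spam + 1, ham) if is_spam else (spam, ham + 1)


def spam_probability(counts, tokens, n_spam, n_ham):
    """Combine per-token evidence; unseen tokens are skipped entirely."""
    log_odds = 0.0
    for tok in set(tokens):
        if tok not in counts:
            continue  # no history for this token, so no evidence either way
        spam, ham = counts[tok]
        # Laplace-smoothed frequency of the token in each class
        p_spam = (spam + 1) / (n_spam + 2)
        p_ham = (ham + 1) / (n_ham + 2)
        log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))


counts = {}
train(counts, ["received:dsl-pool.example", "subject:invoice"], is_spam=True)
train(counts, ["received:mail.example.org", "subject:meeting"], is_spam=False)

# A header token the system has been told about before leans towards spam.
known = spam_probability(counts, ["received:dsl-pool.example"], n_spam=1, n_ham=1)

# Headers and MIME boundary from a host never seen before: nothing matches.
fresh = spam_probability(
    counts, ["received:never-seen.example", "boundary:xyz123"], n_spam=1, n_ham=1
)
```

Here `known` ends up above 0.5 while `fresh` stays at exactly 0.5, i.e.
the classifier has nothing to say about the fresh zombie.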

> Just resting upon BAYES, BOTNET, and PBL, you're not "lucky to have 
> caught the message because you're a late receiver".  You've caught the 
> message due to a combination of policy, misuse, and historical 
> characteristics of spam in general being used to train your system.

All of which need prior examples/reporting of messages similar to the
one you're trying to detect; that's what "historical characteristics of
spam" means.
