Simon Byrnand <[EMAIL PROTECTED]> wrote:
> > * Are you learning a proportionate amount of ham as well?
[...]
> * Bayes doesn't "learn" particular messages, it learns the statistics
> of the words used in the messages.
[...]
None of this addresses the original posted concern: When you feed one or more messages to sa-learn, it should report: "Learned from X messages", where X is some number greater than zero.
Well, all reference to this "original concern" was deleted from the quoting in the messages that I read and replied to... it's a very busy mailing list and I don't read all messages before replying to one...
If sa-learn reports "Learned from 0 messages," it means that it didn't put anything into the Bayes database. The main reason that I have seen for this is that SA thinks it has already seen the message before, meaning that the Message-ID in the mail is already stored in the Bayes database.
That's the usual cause, but it's not the only one, as far as I know. Other criteria may also stop it from learning a message. For example, I believe (and the developers might have to correct me here) that if the message has too little material to generate a worthwhile number of tokens, the message will not be learnt.
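To make the two skip conditions mentioned so far concrete (a Message-ID that is already recorded, and too few tokens), here is a toy Python model. It is only a sketch and not SpamAssassin's code: the class `ToyBayes`, the `MIN_TOKENS` threshold and the whitespace tokenizer are all made up for illustration, and it assumes simple single-part text messages.

```python
from email import message_from_string

MIN_TOKENS = 3  # hypothetical threshold, not SpamAssassin's real value


class ToyBayes:
    """Toy learner: tracks seen Message-IDs and per-token spam counts."""

    def __init__(self):
        self.seen = set()       # Message-IDs already learned
        self.spam_counts = {}   # token -> number of times seen in spam

    def learn_spam(self, raw):
        """Return (learned, reason) for one raw message string."""
        msgid = message_from_string(raw).get("Message-ID", "")
        if msgid in self.seen:
            return False, "already seen"
        # Everything after the first blank line is treated as the body.
        tokens = raw.partition("\n\n")[2].split()
        if len(tokens) < MIN_TOKENS:
            return False, "too few tokens"
        for tok in tokens:
            self.spam_counts[tok] = self.spam_counts.get(tok, 0) + 1
        self.seen.add(msgid)
        return True, "learned"

    def forget(self, raw):
        """Drop the Message-ID so the message can be learned again.

        Token counts are left alone in this toy model.
        """
        msgid = message_from_string(raw).get("Message-ID", "")
        self.seen.discard(msgid)


def learn_many(bayes, messages):
    """Mimic the 'Learned from X messages' summary line."""
    learned = sum(1 for raw in messages if bayes.learn_spam(raw)[0])
    return f"Learned from {learned} messages"
```

In this model, feeding the same message twice, or a second message that reuses the first one's Message-ID, gives "Learned from 0 messages" the second time, and a call to `forget` makes the message learnable again. If a real message still reports zero after a forget, that points at some other skip condition.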
However, the original poster already stated that he used "sa-learn --forget" to try to remove any traces of the message, and yet sa-learn STILL continues to say "Learned from 0 messages", meaning that it does not want to learn from the particular message.
So there must be something else stopping it. I thought I remembered someone (Theo?) asking him to try running it in debugging mode, which should give a reason why it wasn't learnt, but I don't remember seeing a response to this...
Not too long ago I had several spams that all used the same Message-ID. SA refused to learn from any of them except the first one. What can be done to combat that? Of course, duplicate Message-IDs are a violation of the RFCs, but spammers don't care. :)
I've wondered about that before too... but I've been afraid to voice it on the list in case spammers pick up on it... but now that you've mentioned it... :)
What happens if all spammers start sharing a handful of Message-IDs? Will that make Bayes unable to learn from their messages? Good question indeed...
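One possible answer, offered purely as an assumption and not as anything SpamAssassin is known to do: key the "already seen" check on a digest of the message content rather than on the Message-ID alone, so that two different spams sharing an ID still get distinct keys. The function names `dedup_key` and `count_learnable`, and the choice of SHA-1 over the Message-ID plus body, are hypothetical.

```python
import hashlib
from email import message_from_string


def dedup_key(raw):
    """Hypothetical dedup key: SHA-1 over the Message-ID plus the body."""
    msgid = message_from_string(raw).get("Message-ID", "")
    # Everything after the first blank line is treated as the body.
    body = raw.partition("\n\n")[2]
    data = (msgid + "\n" + body).encode("utf-8", errors="replace")
    return hashlib.sha1(data).hexdigest()


def count_learnable(messages):
    """Count messages whose key has not been seen earlier in the list."""
    seen = set()
    learned = 0
    for raw in messages:
        key = dedup_key(raw)
        if key not in seen:
            seen.add(key)
            learned += 1
    return learned
```

With a key like this, an exact resend of the same message is still skipped, but a spammer reusing one Message-ID across different bodies no longer blocks learning.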
Regards, Simon
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk