On Mon, 13 Jan 2025, Anders Gustafsson wrote:

> Hi!
>
> When collecting spam I frequently see multiple copies of the same message, but
> with different fake senders.
> In this case, should I feed just one or all to Bayes?

Yes, feed all copies of verified spam to Bayes. Because Bayes keeps a weighted score per token, the more often a token is seen in spam, the stronger its "spamminess" score. The copies may also differ in things such as network routing headers, so it is better to feed them all to Bayes and let it parse and score them.
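To illustrate why repeated copies matter, here is a toy sketch of per-token scoring. It is not SpamAssassin's actual Bayes implementation; the class name ToyBayes, the whitespace tokenizer, and the add-one smoothing are all assumptions made for illustration only.

```python
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real parser is far more involved.
    return text.lower().split()


class ToyBayes:
    """Hypothetical toy model: counts how many spam/ham messages contain each token."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()
        self.nspam = 0
        self.nham = 0

    def learn(self, text, is_spam):
        # Count each token once per message.
        tokens = set(tokenize(text))
        if is_spam:
            self.spam_tokens.update(tokens)
            self.nspam += 1
        else:
            self.ham_tokens.update(tokens)
            self.nham += 1

    def token_spamminess(self, token):
        # Add-one smoothing keeps unseen tokens away from 0 and 1.
        s = (self.spam_tokens[token] + 1) / (self.nspam + 2)
        h = (self.ham_tokens[token] + 1) / (self.nham + 2)
        return s / (s + h)


bayes = ToyBayes()
bayes.learn("meeting agenda attached", is_spam=False)
bayes.learn("cheap pills now", is_spam=True)
first = bayes.token_spamminess("pills")

# A second copy of the same spam (say, with a different fake sender).
bayes.learn("cheap pills now", is_spam=True)
second = bayes.token_spamminess("pills")
```

In this toy model, second is larger than first: each additional spam copy containing the token pushes its score further toward the spam end.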

Similarly, you also need to feed ham (labeled as such) to Bayes so it knows how to tell right from wrong.
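As a rough sketch of how both kinds of mail could be fed in, the helper below wraps the sa-learn command with its --spam and --ham options. The function names and the directory paths in the comments are hypothetical; adjust them to wherever your verified spam and ham are actually kept.

```python
import subprocess


def sa_learn_command(kind, path):
    """Build the sa-learn argument list for a file or directory of messages."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, path]


def train(kind, path):
    # Runs sa-learn; raises CalledProcessError if it exits non-zero.
    return subprocess.run(sa_learn_command(kind, path), check=True)


# Example calls (hypothetical paths, left commented out so nothing runs by accident):
# train("spam", "/home/user/Maildir/.VerifiedSpam/cur")
# train("ham", "/home/user/Maildir/.VerifiedHam/cur")
```

Keeping the command construction separate from the call makes it easy to check what would be run before touching the Bayes database.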

> Also: Is there a point in feeding such spam that is already flagged by other
> rules than Bayes and if so,
> should I remove the additions that SA adds to the message? Ie: XSPAM etc?
>
> Thanks in advance!

There is no need to strip out SA tags and SA-added headers; the Bayes parser knows to ignore such data.


--
Dave Funk                               University of Iowa
<dbfunk (at) engineering.uiowa.edu>     College of Engineering
319/335-5751   FAX: 319/384-0549        1256 Seamans Center, 103 S Capitol St.
Sys_admin/Postmaster/cell_admin         Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
