If you run spamassassin with -D bayes -t xxx 2>debug.log (where xxx is
the message file), debug.log will show all the "tokens" the Bayes system
extracts from the headers, and you will probably find that a lot of them
are related to mailing lists.

If you teach SA that those tokens are spam and they are also present in
mail from WP or Forbes, their emails will be flagged too. That's expected.

If you want, you can use bayes_ignore_header to make Bayes ignore some headers.
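
For example, a minimal sketch, assuming the list-related tokens in your
debug.log come from headers like the ones below. The header names are only
illustrative and the file location (local.cf) is an assumption; use the
headers you actually see in your own debug output:

```
# local.cf (location is an assumption; adjust to your setup)
# Each line tells Bayes to skip one header when extracting tokens.
# The header names below are illustrative, not a recommended list.
bayes_ignore_header List-Id
bayes_ignore_header List-Unsubscribe
bayes_ignore_header Precedence
```

After changing it, rerun the -D bayes test on the same message and check
that tokens from those headers no longer show up in debug.log.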



On 2/15/23, Matus UHLAR - fantomas <uh...@fantomas.sk> wrote:
>>> * -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
>>> *      [score: 0.0000]
>>>
>>> This indicates a mistrained database, which means you have trained too
>>> many spams or spam-like messages (commercial messages) as ham.
>>>
>>> Proper training of spams should help. Just keep your spam (and
>>> optionally ham) corpora for retraining in case you drop the database.
>>>
>>> I also recommend abstaining from training commercial mail (notices from
>>> e-shops, companies you have done business with, etc.) as ham, unless they
>>> generate a BAYES_999 score and you want it lower. I often train them as
>>> spam so that they give an uncertain BAYES_50 result.
>
> On 14.02.23 23:05, Alex wrote:
>>Is there any ability to distinguish a legitimate newsletter from a spam
>>newsletter?
>
> Very hard.
>
> That's why I recommend not training newsletters unless you know you/users
> want them and they produce a BAYES_99 result.
>
>
>>In other words, if I train emails from Forbes or Washington Post as ham,
>>then train similar newsletter emails from other providers that are
>>more suspect, will bayes still be able to distinguish Forbes and WP as
>>ham?
>
>>The problem is that if I avoid training newsletters or bulk email
>>altogether, then I'm also left with spam newsletters still only hitting
>>bayes50.
>
> If you only do this for Forbes or Washington Post, bayes will likely be able
> to distinguish other newsletters, if you train those as spam.
>
>>I'm actually in a situation now where Forbes and WP newsletters are being
>>marked as spam, so I'm considering retraining, but wondering what
>>approach/best practices I should be following.
>
> This should be safe. There are many types of newsletters; it would only be
> a problem if you started training them as ham without really knowing they
> are welcome.
>
> --
> Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
> Warning: I wish NOT to receive e-mail advertising to this address.
> Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
> WinError #99999: Out of error messages.
>
