> * -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
> *      [score: 0.0000]
This indicates a mistrained database: too many spam or spam-like
messages (commercial mail) have been trained as ham.
Training spam properly should help. Keep your spam (and optionally ham)
corpora so that you can retrain in case you ever drop the database.
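
For reference, a retraining run from saved corpora could look roughly
like the sketch below. It assumes sa-learn is on the PATH and that the
corpora live in the hypothetical directories corpus/spam and corpus/ham.
By default it only prints the commands instead of running them, because
--clear wipes the existing Bayes database.

```python
import subprocess


def retrain_commands(spam_dir, ham_dir=None):
    """Build the sa-learn calls for a from-scratch retrain.

    The directory names are placeholders; point them at your own corpora.
    """
    commands = [
        ["sa-learn", "--clear"],           # drop the existing Bayes database
        ["sa-learn", "--spam", spam_dir],  # learn every message in spam_dir as spam
    ]
    if ham_dir is not None:
        commands.append(["sa-learn", "--ham", ham_dir])  # optional ham corpus
    return commands


def retrain(spam_dir, ham_dir=None, dry_run=True):
    """Print the commands (dry run) or actually execute them."""
    for command in retrain_commands(spam_dir, ham_dir):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


# Dry run only: shows what would be executed without touching the database.
retrain("corpus/spam", "corpus/ham")
```

Call retrain(..., dry_run=False) as the user that owns the Bayes
database once the printed commands look right.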
I also recommend not training commercial mail (notices from e-shops,
companies you have done business with, etc.) as ham, unless it hits
BAYES_999 and you want the score lower. I often train such mail as spam
so that it ends up with an uncertain BAYES_50 result.
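
To illustrate why this happens, here is a toy naive Bayes filter. It is
only a sketch: SpamAssassin's real Bayes implementation tokenizes and
combines probabilities differently, and the class name ToyBayes, the
sample messages and the training counts are all made up for the example.

```python
import math
from collections import Counter


class ToyBayes:
    """Tiny naive Bayes filter; NOT SpamAssassin's actual algorithm."""

    def __init__(self):
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, label, text, times=1):
        # Each token is counted once per trained message.
        tokens = set(text.lower().split())
        for _ in range(times):
            self.message_counts[label] += 1
            self.token_counts[label].update(tokens)

    def spam_probability(self, text):
        # Sum per-token log odds, using add-one smoothing so that
        # unseen tokens never produce a zero probability.
        log_odds = 0.0
        for token in set(text.lower().split()):
            p_spam = (self.token_counts["spam"][token] + 1) / (
                self.message_counts["spam"] + 2
            )
            p_ham = (self.token_counts["ham"][token] + 1) / (
                self.message_counts["ham"] + 2
            )
            log_odds += math.log(p_spam / p_ham)
        return 1.0 / (1.0 + math.exp(-log_odds))


commercial = "sale discount offer unsubscribe"  # hypothetical e-shop notice

bayes = ToyBayes()
bayes.train("spam", "cheap pills casino", times=5)
bayes.train("ham", "meeting agenda project", times=5)

# Mistraining: the commercial notice is learned as ham only.
bayes.train("ham", commercial, times=5)
mistrained = bayes.spam_probability(commercial)

# Learning the same notice as spam equally often cancels the ham evidence.
bayes.train("spam", commercial, times=5)
neutralized = bayes.spam_probability(commercial)

print(f"trained as ham only:  {mistrained:.3f}")
print(f"also trained as spam: {neutralized:.3f}")
```

In this toy setup the first value lands close to zero (the BAYES_00 end
of the scale) and the second sits at the midpoint (BAYES_50 territory).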
On 14.02.23 23:05, Alex wrote:
> Is there any ability to distinguish a legitimate newsletter from a spam
> newsletter?
Very hard.
That's why I recommend not training newsletters unless you know that you
or your users want them and they produce a BAYES_99 result.
> In other words, if I train emails from Forbes or Washington Post as ham,
> then train similar newsletter emails from other providers that are
> more suspect, will bayes still be able to distinguish Forbes and WP as ham?
> The problem is that if I avoid training newsletters or bulk email
> altogether, then I'm also left with spam newsletters still only hitting
> BAYES_50.
If you only do this for Forbes and the Washington Post, Bayes will
likely still be able to tell them apart from other newsletters,
provided you train those as spam.
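
A rough sketch of why that can work: tokens shared by all newsletters
end up neutral, while tokens unique to one side carry the decision. The
function token_spamminess and the sample messages are hypothetical, and
the smoothing is a simplification, not SpamAssassin's actual formula.

```python
from collections import Counter


def token_spamminess(ham_msgs, spam_msgs):
    """Per-token spam score in [0, 1]; 0.5 means the token is neutral."""
    ham = Counter(t for m in ham_msgs for t in set(m.lower().split()))
    spam = Counter(t for m in spam_msgs for t in set(m.lower().split()))
    scores = {}
    for token in set(ham) | set(spam):
        # Add-one smoothing keeps unseen tokens away from 0 and 1.
        p_spam = (spam[token] + 1) / (len(spam_msgs) + 2)
        p_ham = (ham[token] + 1) / (len(ham_msgs) + 2)
        scores[token] = p_spam / (p_spam + p_ham)
    return scores


# Hypothetical corpora: wanted newsletters as ham, suspect ones as spam.
ham_msgs = ["forbes daily newsletter unsubscribe"] * 3
spam_msgs = ["winner prize newsletter unsubscribe"] * 3

scores = token_spamminess(ham_msgs, spam_msgs)
for token in sorted(scores, key=scores.get):
    print(f"{token:12s} {scores[token]:.2f}")
```

Here "newsletter" and "unsubscribe" score as neutral, so the tokens that
only appear in the wanted or the suspect mail decide the result.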
> I'm actually in a situation now where Forbes and WP newsletters are being
> marked as spam, so I'm considering retraining, but wondering what
> approach/best practices I should be following.
This should be safe. There are many types of newsletters; the problem
would only arise if you started training them as ham without really
knowing they are welcome.
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
WinError #99999: Out of error messages.