On 10.01.17 14:13, RW wrote:
>The pastebin example was auto-learned as ham, it may be hard to
>counter this with manual training.
On Wed, 11 Jan 2017 09:29:51 +0100
Matus UHLAR - fantomas wrote:
depends... I found that proper training can fix this quite fast
On 11.01.17 14:49, RW wrote:
Since manual training unlearns before it relearns, it's feasible to
undo all the damage, but it's difficult to do that outside of a single
user database. If you don't catch them all, you aren't fixing it, you
are just working around the damage.
otoh, some of the ham/spam is already classified properly, so you only need
to train the other part...
>bayes_auto_learn_threshold_nonspam should be set lower.
I agree, and would set that to -0.1 at most. However, this requires network
checks to be enabled, since there are nearly no rules with a negative score
other than the network and bayes ones.
And some of those are arguably pay-to-spam lists.
IMO there's no good way to autolearn ham unless you are prepared to
write enough local rules to positively identify it. It should be seen
as a last resort.
it's a good start that will help you in training manually :)
If you are in a position to train manually then IMO autotraining is
more trouble than it's worth, except perhaps augmenting manual training
with something like:
bayes_auto_learn_on_error 1
bayes_auto_learn_threshold_nonspam -1000
This lets Bayes do some useful spam learning in real-time, without
much risk of mistraining.
I'm afraid that bayes_auto_learn_on_error will only cause ham to be trained as
spam (because it was not hit) and vice versa, after you train your DB properly ...
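For reference, a minimal Python sketch of how the settings discussed above interact, assuming the behaviour described in the AutoLearnThreshold plugin documentation. The function name autolearn_decision is hypothetical, and the model is deliberately simplified: the real plugin computes the autolearn score without the Bayes rules and also requires a minimum number of header and body points before learning spam, neither of which is modelled here.

```python
from typing import Optional


def autolearn_decision(
    score: float,
    bayes_rule: Optional[str],
    threshold_nonspam: float = 0.1,   # documented default
    threshold_spam: float = 12.0,     # documented default
    learn_on_error: bool = False,
) -> Optional[str]:
    """Return 'ham', 'spam', or None when the message is not autolearned.

    Simplified model (assumption): `score` stands for the autolearn score,
    and `bayes_rule` is the BAYES_* rule that hit, or None if none did.
    """
    if score < threshold_nonspam:
        verdict = "ham"
    elif score >= threshold_spam:
        verdict = "spam"
    else:
        # Between the two thresholds: nothing is learned.
        return None

    if learn_on_error:
        # Skip learning when Bayes already agrees with the verdict.
        if verdict == "spam" and bayes_rule == "BAYES_99":
            return None
        if verdict == "ham" and bayes_rule == "BAYES_00":
            return None
    return verdict


# RW's suggestion: learn on error only, and effectively never autolearn ham.
rw = dict(threshold_nonspam=-1000.0, learn_on_error=True)
print(autolearn_decision(15.0, "BAYES_50", **rw))
print(autolearn_decision(15.0, "BAYES_99", **rw))
print(autolearn_decision(-2.0, "BAYES_50", **rw))
```

Under this model, the two settings together mean ham is never autolearned, and spam is autolearned only when the message scores past the spam threshold and Bayes had not already hit BAYES_99. Whether that matches what happens on a well-trained database is exactly the point under dispute, so treat the sketch as an aid to the discussion rather than a statement of what the plugin does in every case.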
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Eagles may soar, but weasels don't get sucked into jet engines.