On 10.01.17 14:13, RW wrote:
>The pastebin example was auto-learned as ham, it may be hard to
>counter this with manual training.

On Wed, 11 Jan 2017 09:29:51 +0100
Matus UHLAR - fantomas wrote:
Depends... I have found that proper training can fix this quite fast.

On 11.01.17 14:49, RW wrote:
Since manual training unlearns before it relearns, it's feasible to
undo all the damage, but it's difficult to do that outside of a single
user database. If you don't catch them all, you aren't fixing it, you
are just working around the damage.

OTOH, some part of the ham/spam has been learned properly, so you only
need to train the other part...

>bayes_auto_learn_threshold_nonspam should be set lower.

I agree, and would set that to -0.1 at most. However, this requires
network checks to be on, since there are nearly no rules with a negative
score other than the network and Bayes rules.

And some of those are arguably pay-to-spam lists.

IMO there's no good way to autolearn ham unless you are prepared to
write enough local rules to positively identify it. It should be seen
as a last resort.

It's a good start that will help you when training manually :)

If you are in a position to train manually then IMO autotraining is
more trouble than it's worth, except perhaps augmenting manual training
with something like:

bayes_auto_learn_on_error 1
bayes_auto_learn_threshold_nonspam  -1000

This lets Bayes do some useful spam learning in real-time, without
much risk of mistraining.

I'm afraid that bayes_auto_learn_on_error will only cause ham to be
trained as spam (because not hit) and vice versa, once you have trained
your DB properly ...
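
For reference, the lower ham autolearn threshold discussed earlier in
this thread could be written roughly as follows in local.cf. This is
only a sketch: it assumes autolearning is enabled and that network
tests are not disabled, and the comments are an interpretation rather
than anything taken from the posts above.

```
# local.cf fragment (sketch, adjust to your own setup)
# Keep Bayes autolearning switched on.
bayes_auto_learn 1
# Autolearn a message as ham only if its score is below -0.1.
bayes_auto_learn_threshold_nonspam -0.1
```

With a threshold below zero, a message needs at least one
negative-scoring rule to hit before it can be autolearned as ham.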

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Eagles may soar, but weasels don't get sucked into jet engines.