On Wed, 11 Jan 2017 09:29:51 +0100
Matus UHLAR - fantomas wrote:

> >> On 10.01.17 10:48, Emin Akbulut wrote:
> >> >Recently we receive spam messages and SA cannot block them.
> [deleted]
> >> >Message source:
> >> >http://pastebin.com/nnN0jGw8
>
> >On Tue, 10 Jan 2017 10:43:40 +0100 Matus UHLAR - fantomas wrote:
> >> clear case of mistrained BAYES causing message being marked as ham.
> >> you just have to re-train such spams as spam, it may take some time
> >> (not very long) until it starts hitting properly.
>
> On 10.01.17 14:13, RW wrote:
> >The pastebin example was auto-learned as ham, it may be hard to
> >counter this with manual training.
>
> depends... I found out proper trainning can fix quite fast

Since manual training unlearns before it relearns, it's feasible to
undo all the damage, but it's difficult to do that outside of a single
user database. If you don't catch them all, you aren't fixing it, you
are just working around the damage.

> >bayes_auto_learn_threshold_nonspam should be set lower.
>
> I agree, and would set that to -0.1 max. However this requires network
> checks on, since there are nearly no rules other than network and
> bayes with negative score.

And some of those are arguably pay-to-spam lists. IMO there's no good
way to autolearn ham unless you are prepared to write enough local
rules to positively identify it. It should be seen as a last resort.

If you are in a position to train manually then IMO autotraining is
more trouble than it's worth, except perhaps augmenting manual
training with something like:

bayes_auto_learn_on_error 1
bayes_auto_learn_threshold_nonspam -1000

This lets Bayes do some useful spam learning in real-time, without
much risk of mistraining.
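
To illustrate why those two settings behave that way, here is a rough model of the autolearn decision, written in Python rather than SpamAssassin's own Perl. It is a simplified sketch, not the real plugin code: the function name `autolearn_decision` and the `bayes_verdict` argument are made up for illustration, the default thresholds (0.1 and 12.0) are what I believe the stock defaults to be, and the extra checks SpamAssassin applies before learning spam (such as minimum header and body points) are left out.

```python
def autolearn_decision(score, bayes_verdict, *,
                       threshold_nonspam=0.1,
                       threshold_spam=12.0,
                       on_error=False):
    """Return 'ham', 'spam', or None (do not autolearn).

    score         -- the message's autolearn score (assumed to exclude Bayes rules)
    bayes_verdict -- 'ham', 'spam', or None if Bayes had no firm opinion
                     (hypothetical simplification of the BAYES_* rule hits)
    """
    if score < threshold_nonspam:
        target = "ham"
    elif score >= threshold_spam:
        target = "spam"
    else:
        return None  # score falls between the two thresholds

    # Models bayes_auto_learn_on_error: skip learning when Bayes
    # already agrees with what we were about to teach it.
    if on_error and bayes_verdict == target:
        return None
    return target


# Stock-like settings: a slightly negative score is autolearned as ham.
print(autolearn_decision(-0.5, None))

# Suggested settings: no realistic score is below -1000, so ham is never
# autolearned; spam is learned only when Bayes got it wrong.
suggested = dict(threshold_nonspam=-1000, on_error=True)
print(autolearn_decision(-0.5, None, **suggested))
print(autolearn_decision(15.0, "ham", **suggested))
print(autolearn_decision(15.0, "spam", **suggested))
```

With the suggested settings the ham branch is effectively unreachable, so the only thing autolearning can do is correct spam that Bayes misjudged.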