> On Mon, 18 Jun 2018 06:13:06 -0600 @lbutlr wrote:
>> I have a script that runs when a mail is moved out of the Junk
>> folder to pass the mail through sa-learn --ham,
I think this is what Dovecot's Antispam plugin does:
https://wiki2.dovecot.org/Plugins/Antispam
and maybe ImapSieve:
https://wiki2.dovecot.org/HowTo/AntispamWithSieve
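
For reference, the move-triggered training described above could be
sketched roughly like this. It is only an illustration, not what either
plugin actually runs: the folder name "Junk", the function names, and the
assumption that sa-learn is on PATH and reads the message from standard
input are all mine.

```python
import subprocess


def sa_learn_command(src_folder, dst_folder, junk="Junk"):
    """Return the sa-learn argv for a folder move, or None if nothing to learn.

    Hypothetical helper: the folder name "Junk" is an assumption.
    """
    if src_folder == junk and dst_folder != junk:
        return ["sa-learn", "--ham"]    # rescued from Junk: learn as ham
    if dst_folder == junk and src_folder != junk:
        return ["sa-learn", "--spam"]   # moved into Junk: learn as spam
    return None


def train_on_move(message, src_folder, dst_folder, run=subprocess.run):
    """Pipe the raw message (bytes) to sa-learn.

    `run` is injectable so the logic can be exercised without sa-learn.
    """
    cmd = sa_learn_command(src_folder, dst_folder)
    if cmd is None:
        return False
    run(cmd, input=message, check=True)
    return True
```

Moves that never touch the Junk folder return None and train nothing.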
On 18 Jun 2018, at 08:47, RW <rwmailli...@googlemail.com> wrote:
> Whether this is the Dovecot plugin or something local it's a poor
> way of training Bayes. You're training on SA errors not Bayes
> errors. Most imperfect Bayes results don't translate into
> misclassifications.
It's still better than nothing, and it helps us solve the main problem:
misclassifications.
On Mon, 18 Jun 2018 10:13:04 -0600 @lbutlr wrote:
I’m not sure what you’re trying to say here. Certainly SA does
misclassify mail as spam at times, ...
Training the messages as ham is useful.
On 18.06.18 22:58, RW wrote:
The problem is that, unless there is something badly wrong, a typical
single-user account won't generate enough FPs and FNs for a properly
trained database. I found that Bayes's identification of ham improved
until I'd trained about 1500 ham, but I wouldn't expect to get anything
like 1500 SpamAssassin FPs in a lifetime.
It's not even proper train-on-error because it's training on
SpamAssassin misclassifications and not correcting Bayes's own
errors. It allows Bayes to go uncorrected until it results
in an FP or FN.
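
To make the distinction concrete, here is a small sketch of the two
training triggers. The 5.0 threshold is SpamAssassin's default
required_score; the 0.5 cut-off for calling something a "Bayes error",
the function names, and the sample numbers are assumptions for
illustration only.

```python
def sa_error(total_score, is_spam, required_score=5.0):
    """True if SpamAssassin's overall verdict disagrees with the truth."""
    return (total_score >= required_score) != is_spam


def bayes_error(bayes_prob, is_spam, cutoff=0.5):
    """True if Bayes alone leans the wrong way (assumed 0.5 cut-off)."""
    return (bayes_prob >= cutoff) != is_spam


# Hypothetical ham message: Bayes is badly wrong (0.85), but the other
# rules keep the total score (2.3) under the spam threshold.
msg = {"is_spam": False, "total_score": 2.3, "bayes_prob": 0.85}

trained_by_plugin = sa_error(msg["total_score"], msg["is_spam"])
needs_bayes_fix = bayes_error(msg["bayes_prob"], msg["is_spam"])
```

For this message the user never sees it in Junk, so a move-triggered
plugin never trains it, even though Bayes got it wrong.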
Of course, training BAYES_999 as spam and BAYES_00 as ham won't change
their scores, but it can still push a possible BAYES_20 down to BAYES_00
and a BAYES_99 up to BAYES_999.
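
As a rough illustration of how a probability maps to those rule names,
here is a sketch. The ranges and the boundary handling are written from
my recollection of the stock Bayes rules and are an assumption; check
the rules file on your own install before relying on them. The function
name is hypothetical.

```python
# Assumed probability ranges (low, high) per rule, from memory of the
# stock SpamAssassin rules; local installs may differ.
BAYES_RULES = [
    ("BAYES_00", 0.00, 0.01),
    ("BAYES_05", 0.01, 0.05),
    ("BAYES_20", 0.05, 0.20),
    ("BAYES_40", 0.20, 0.40),
    ("BAYES_50", 0.40, 0.60),
    ("BAYES_60", 0.60, 0.80),
    ("BAYES_80", 0.80, 0.95),
    ("BAYES_95", 0.95, 0.99),
    ("BAYES_99", 0.99, 1.00),
    ("BAYES_999", 0.999, 1.00),
]


def bayes_rules(prob):
    """Return the rule names hit by a Bayes probability.

    Assumes each range is (low, high], except that a low bound of 0
    also matches a probability of exactly 0.
    """
    return [name for name, low, high in BAYES_RULES
            if (low == 0 or prob > low) and prob <= high]
```

Under these assumed ranges, a message at 0.12 hits BAYES_20, and more ham
training that pulls similar mail below 0.01 would move it to BAYES_00.
Likewise 0.995 hits only BAYES_99, while 0.9995 hits both BAYES_99 and
BAYES_999.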
You can work around the plugin's deficiencies by using autotraining or
doing some additional training, but then the plugin is of limited
relevance.
Of course, both autotraining AND fixing errors are required for it to
work properly.
Unfortunately I have seen spam repeatedly trained as ham, because of some
negative-scoring rules and too high an autolearn threshold.
The same can happen the opposite way. Having a way to fix those manually
helps users.
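
A deliberately simplified sketch of how that goes wrong. The defaults
0.1 and 12.0 are, as far as I know, the stock values of
bayes_auto_learn_threshold_nonspam and bayes_auto_learn_threshold_spam;
everything else (the function name, the rule scores) is made up for
illustration, and real autolearning applies further conditions that are
ignored here.

```python
def autolearn_decision(rule_scores, ham_threshold=0.1, spam_threshold=12.0):
    """Simplified autolearn: decide purely on the summed rule scores.

    Returns "ham", "spam", or None (not learned). This ignores the
    extra checks real SpamAssassin autolearning performs.
    """
    score = sum(rule_scores)
    if score < ham_threshold:
        return "ham"
    if score >= spam_threshold:
        return "spam"
    return None


# Hypothetical spam: positive hits cancelled out by negative-scoring rules.
spam_hits = [3.5, 2.0, -5.0, -1.0]            # sums to -0.5
# Hypothetical spam that only a raised ham threshold would mislearn.
borderline_hits = [1.6, -1.0]                 # sums to about 0.6

learned_default = autolearn_decision(spam_hits)
learned_raised = autolearn_decision(borderline_hits, ham_threshold=1.0)
```

Both spam messages end up learned as ham: the first because negative
rules drag the sum below the default threshold, the second only because
the ham threshold was raised. With the default threshold the second
message would not be learned at all.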
IMO the plugin is best left to statistical filters like DSPAM.
Isn't DSPAM dead?
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
I intend to live forever - so far so good.