On 11/12/2017 11:07 AM, micah wrote:
> Axb <axb.li...@gmail.com> writes:
>> On 11/12/2017 05:35 PM, micah wrote:
>>> David Jones <djo...@ena.com> writes:
>>>>> I am interested in seeing the Bayes info in the database, because it
>>>>> was created years ago.
>>>> Spam changes all the time, so I train mine daily and manually expire
>>>> tokens after about a month. Depending on your recipients, number of
>>>> mailboxes, and mail flow, you may be fine with not training that often,
>>>> but I don't think tokens from years ago are going to be very accurate on
>>>> current mail flow.
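
In case it helps, the daily training can be as simple as a nightly cron job
along these lines; the corpus paths are just placeholders, and the expiry cap
should be tuned to whatever roughly a month of tokens means at your volume:

  # local.cf: keep expiry manual so it runs at a predictable time
  bayes_auto_expire        0
  bayes_expiry_max_db_size 1000000

  # nightly cron, run as the user that owns the Bayes DB
  sa-learn --spam /path/to/corpus/spam
  sa-learn --ham  /path/to/corpus/ham
  sa-learn --force-expire
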
>>>> Combine a large list of whitelist_auth entries with well-trained Bayes,
>>>> and you can bump up the BAYES_* scores with nice results.
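
As a minimal local.cf sketch of that idea (the addresses and scores here are
only illustrations, not production values):

  # senders verified by SPF or DKIM that should never be scored as spam
  whitelist_auth  *@example-partner.com
  whitelist_auth  billing@example-vendor.net

  # with Bayes well trained, lean harder on the high-confidence rules
  score BAYES_99   4.5
  score BAYES_999  1.0   # applied on top of BAYES_99
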
>>> How do you deal with a large user base and Bayesian databases? It seems
>>> like a shared one just becomes useless fast, but allocating an individual
>>> database to each user is quite a hassle as well.
>> Servicing 40k+ users, I use a central Redis-based Bayes DB in autolearn
>> mode. Works for me.
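
For anyone curious what that looks like, a site-wide Redis-backed Bayes with
autolearn is configured roughly like this in local.cf (SpamAssassin 3.4+; the
server address, database number, and TTLs are placeholders, and the thresholds
shown are the stock defaults):

  bayes_store_module  Mail::SpamAssassin::BayesStore::Redis
  bayes_sql_dsn       server=127.0.0.1:6379;database=2
  bayes_token_ttl     21d
  bayes_seen_ttl      8d

  # let SpamAssassin train itself on clearly ham/spam messages
  bayes_auto_learn                    1
  bayes_auto_learn_threshold_nonspam  0.1
  bayes_auto_learn_threshold_spam     12.0
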
> I'm dealing with about 3x the users. I don't use auto-learn mode but let
> them train mistakes, and the whole thing is total garbage now and has to
> be reset again. Now it thinks, with 99.9% probability, that PGP-encrypted
> emails are spam.
I recommend against letting end users control the Bayes training of a
single global DB without moderation.
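
One rough way to keep it moderated: give users shared "report spam" / "report
ham" folders they can drag mistakes into, then only run sa-learn against those
folders after an admin (or at least a sanity-check script) has reviewed them.
Something like this nightly, with the maildir paths depending entirely on your
IMAP layout:

  # learn only from the shared report folders, and only after review
  sa-learn --spam /var/vmail/shared/reported-spam/cur
  sa-learn --ham  /var/vmail/shared/reported-ham/cur
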
--
David Jones