On 01/28/2015 04:38 PM, Reindl Harald wrote:

is AFAIK relevant in the context of sa-learn so it does not re-train the
same messages again and again - and it has its own bugs, because for a few
messages it contains random parts of the message itself; firing sa-learn
on the whole corpus would add those messages to "bayes_toks" each time

see the two example snippets below
hence it is that large here

-rw------- 1 sa-milt sa-milt 5,4K 2015-01-28 16:34 bayes_journal
-rw------- 1 sa-milt sa-milt 1,3M 2015-01-28 16:12 bayes_seen
-rw------- 1 sa-milt sa-milt  40M 2015-01-28 16:33 bayes_toks
-rw------- 1 sa-milt sa-milt   98 2014-08-21 17:47 user_prefs
_________________________________________________

something here does NOT make sense

1.3 MB of bayes_seen against 40 MB of bayes_toks.

someone please correct me if I'm wrong:

AFAIK, this probably means you've deleted bayes_seen at some point, so Bayes has lost its record of what it has already processed and will relearn messages you already fed it.
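
One quick way to sanity-check that (a sketch; it assumes you run it as the user that owns the Bayes files, here sa-milt) is to dump the database counters and see whether the nspam/nham and token counts match what you think you have trained:

  # print Bayes DB counters: token count, nspam/nham, last expiry run
  sa-learn --dump magic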

Also, a 40 MB bayes_toks file will not exactly help your speed.
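
If the token DB really has grown that far, one thing worth trying (a sketch using the stock expiry options; 150000 is just their documented default, tune to taste) is to force a token expiry run and cap the DB size in local.cf:

  # synchronize the journal and expire old tokens now
  sa-learn --force-expire

  # local.cf: keep auto-expiry on and cap the token count
  bayes_auto_expire            1
  bayes_expiry_max_db_size     150000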

If you don't want to use Redis, then at least use SDBM, which is way faster.

local.cf:

bayes_store_module           Mail::SpamAssassin::BayesStore::SDBM

and restore/relearn your corpus (see the sketch below)
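
A minimal sketch of that migration, assuming the default per-user DB under ~/.spamassassin and that the commands run as the Bayes user; the backup file name is only an example:

  # dump the current Bayes DB (tokens + seen) to a flat text file
  sa-learn --backup > bayes-backup.txt

  # switch the backend in local.cf:
  #   bayes_store_module  Mail::SpamAssassin::BayesStore::SDBM

  # load the dump into the new SDBM-backed database
  sa-learn --restore bayes-backup.txt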



