Oliver Thalmann wrote:
>
> Hi
>
> are there some recommendations for tuning the size
> (number of tokens) of a bayes db ?
>
> i.e. are there some recommendations of optimal number of
> tokens, maximum recommended number of tokens, etc ?
>
> i currently have something like this
>
> sa-learn --dump magic
> 0.000          0          2          0  non-token data: bayes db version
> 0.000          0     359318          0  non-token data: nspam
> 0.000          0      36472          0  non-token data: nham
> 0.000          0    4316998          0  non-token data: ntokens
>
> with autolearn thresholds at 0.15 for ham and 10.5 for spam, and the
> database isn't "autoexpire" (currently done via a cron-job every 4 weeks)

You have to set the variable for it in user_prefs, which tells sa-learn at
how many tokens to expire (bayes_expiry_max_db_size, going by the list
below). Some of the docs listed criteria for "opportunistic" expiration:

- the last expire was attempted at least 12hrs ago
- bayes_auto_expire does not equal 0
- the number of tokens in the DB is > 100,000
- the number of tokens in the DB is > bayes_expiry_max_db_size
- there is at least a 12 hr difference between the oldest and newest
  token atimes

> this makes a approx 160Mb bayes_toks file

That's pretty large.

> should i "expire" it more often (last expiration run was a few weeks
> ago) ?

Yeah, probably. According to the docs, training on more than 5000 messages
does no additional good. If I recall, there are about 15 tokens per message
(or that's the target), so 150k tokens (5000 ham plus 5000 spam, at 15
tokens each) is good. That makes for about 5 MB, a bit smaller than the
160 MB you're running now :).

http://spamassassin.rediris.es/doc/sa-learn.html#expiration

Bryan

> Thanks
>
> PS : why is there much more nspam than nham...well, currently about
> 60% of our received internet email traffic is spam....what a waste of
> resources)

--
The trick is to fall and miss the ground. This happens only when someone
falls and becomes so preoccupied with something else that they forget
that they are falling, and therefore are no longer doing it. Flying is
just a permanent state of falling, minus the hitting the ground part.
   - Douglas Adams
http://www.wecs.com/content.htm

This signature file is generated by Pick-a-Tag !
Written by Jeroen van Vaarsel
http://www.google.com/search?hl=en&ie=ISO-8859-1&q=pick-a-tag
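
For anyone who wants to check the numbers in the reply above, here is a rough sketch of the arithmetic in Python. It only uses figures quoted in the thread. The 15 tokens per message is the replier's recollection, not a verified value, and the size estimate assumes the bayes_toks file grows linearly with the token count, which is a simplification. The function names are made up for illustration.

```python
# Figures quoted in the thread (sa-learn --dump magic output and the reply).
CURRENT_TOKENS = 4_316_998     # ntokens reported by the dump
CURRENT_FILE_MB = 160          # approximate size of the bayes_toks file
MESSAGES_PER_CLASS = 5_000     # beyond this, more training reportedly does not help
TOKENS_PER_MESSAGE = 15        # recalled in the reply, not verified
CLASSES = 2                    # ham and spam


def target_tokens() -> int:
    """Token count suggested in the reply: 5000 messages each of ham and spam."""
    return MESSAGES_PER_CLASS * TOKENS_PER_MESSAGE * CLASSES


def estimated_file_mb(tokens: int) -> float:
    """Estimate the file size for a given token count.

    Assumption: size scales linearly with the number of tokens, using the
    current database (160 MB for 4,316,998 tokens) as the reference point.
    """
    return CURRENT_FILE_MB * tokens / CURRENT_TOKENS


if __name__ == "__main__":
    tokens = target_tokens()
    print(f"target tokens: {tokens}")
    print(f"estimated bayes_toks size: {estimated_file_mb(tokens):.1f} MB")
```

Under those assumptions the target comes out at 150000 tokens and roughly 5.6 MB, which is in line with the "about 5 MB" estimate in the reply.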