Andy Jezierski writes:
> Are there any instructions in setting up the Bayes DB using a Redis
> server?
Yes, in the release notes (currently also in
build/announcements/PROPOSED-3.4.0.txt in svn). Pretty much exactly
as you already have it.

> I've installed the server, took the sample config options and added them
> to local.cf
>
>   bayes_store_module Mail::SpamAssassin::BayesStore::Redis
>   bayes_store_module_additional Mail::SpamAssassin::Util::TinyRedis
>   bayes_sql_dsn server=127.0.0.1:6379;password=spamd;database=2
>   bayes_token_ttl 21d
>   bayes_seen_ttl 8d
>   bayes_auto_expire 1
>   use_bayes 1
>   bayes_auto_learn 1
>
> Performed a redis-cli -n 2 FLUSHDB
>
> Did a backup of one of my mysql bayes databases and am attempting to do a
> restore to the new system.

Good.

> Looks like the redis server keeps chewing up swap space until it runs out,
> then the redis server terminates.
>
> Running on FreeBSD 9.2, perl 5.18-5.18.2, redis server 2.8.4
> Any ideas?

Depends very much on the number of tokens you have in your SQL database.
Mine (circa 1000 users) keeps hovering at about 1 M tokens (and keeps only
a few very recent 'seen' entries), resulting in the redis server using
under 300 MB of memory:

  $ redis-cli -n 2 keys 'w:*' | wc -l
  1091475
  $ redis-cli -n 2 keys 's:*' | wc -l
  1324

It may be worthwhile to purge old tokens from SQL first, before creating
a backup. Also, it is safe to ditch the entire 'seen' set of records; it
is not worth transferring them to a new database. If this still leaves an
unreasonable number of tokens, it may be worth decimating the set - just
preserving a random subset of tokens. (Command sketches for these steps
are in the P.S. below.)

Another option is to just start from an empty database. With a reasonable
set of other rules, network tests, and autolearning on, the required 200
samples of ham and 200 of spam can be reached quickly on a busy server.
During initial learning, consider decreasing the scores of the BAYES_00
and BAYES_99 rules.

Note that bayes_token_ttl and bayes_seen_ttl have no effect on entries
loaded from a backup dump; they are all given a 'current' timestamp (with
some random offset, so that they will not all expire at exactly the same
time). But in a steady state, these *_ttl settings let you control how
many items are kept in the database on average.

Axb writes:
> what does sa-learn --dump magic say (when using mysql)

Good idea to check this first.

> my Redis
>
>    PID USER  PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
>   5728 root  20  0 5355m 5.1g 1020 S  1.3 37.1 711:19.45 redis-server
>
> sa-learn --dump magic
> 0.000          0          3          0  non-token data: bayes db version
> 0.000          0   16481050          0  non-token data: nspam
> 0.000          0    5690858          0  non-token data: nham
>
> bayes_token_ttl 864000
> bayes_seen_ttl 2d

A biggie! With 16.5 M spam and 5.7 M ham messages learned, a huge token
set is to be expected.

Btw, with a redis db the number of tokens actually in the database may not
be directly related to the number of learned and reported tokens, because
of the automatic expiration performed by the redis server (according to
bayes_token_ttl) - unlike other bayes back-ends, where purging is done
explicitly by SpamAssassin.

  Mark
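P.S. A few command sketches for the steps above - untested here, so
treat them as a starting point rather than a recipe. First, a quick
check that the connection parameters from your local.cf actually reach
the redis server (host, port, password and db number as in your
bayes_sql_dsn):

  $ redis-cli -h 127.0.0.1 -p 6379 -a spamd -n 2 PING
  PONG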
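Counting keys with KEYS, as I did above, blocks the server while it
walks the whole keyspace; with redis 2.8 the incremental --scan option
of redis-cli should give the same counts without blocking, and INFO
shows the memory footprint while a restore is running:

  $ redis-cli -n 2 --scan --pattern 'w:*' | wc -l
  $ redis-cli -n 2 --scan --pattern 's:*' | wc -l
  $ redis-cli -n 2 INFO memory | grep used_memory_human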
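For the purge-then-transfer route: run the expiry while bayes_sql_dsn
still points at MySQL, dump, optionally strip the 'seen' records, then
restore after switching local.cf to the Redis settings. The file names
are arbitrary, and the grep assumes 'seen' records are the only lines
in the dump starting with "s" - do verify that against your own dump:

  $ sa-learn --force-expire            # purge old tokens from the SQL db
  $ sa-learn --backup > bayes.dump     # tokens plus 'seen' entries
  $ grep -v '^s' bayes.dump > bayes.tokens   # ditch the 'seen' records
  # ... now switch bayes_store_module / bayes_sql_dsn in local.cf ...
  $ sa-learn --restore bayes.tokens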
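If that still leaves an unreasonable number of tokens, the decimation
can be done on the dump itself. A sketch that keeps a random half of
the token records (assuming token records are the lines starting with
"t"; tune the 0.5 to taste):

  $ awk -F'\t' 'BEGIN{srand()} $1 != "t" || rand() < 0.5' \
        bayes.tokens > bayes.decimated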
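For the start-from-empty route, the temporary score softening would go
into local.cf like this (the values are only an illustration of "milder
than stock"; remove the lines once the database has matured):

  score BAYES_00 -0.5
  score BAYES_99  1.5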
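To see the per-entry expiration at work after a restore, ask redis for
the remaining time-to-live of some token; a result of -1 would mean no
expiry is set on that key:

  $ key=$(redis-cli -n 2 --scan --pattern 'w:*' | head -n 1)
  $ redis-cli -n 2 TTL "$key"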
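And as a safety net against the swap exhaustion you saw, redis itself
can be capped: with a memory limit and the volatile-ttl policy it
evicts the entries closest to their bayes_token_ttl expiry instead of
growing into swap. In redis.conf (the 2gb figure is just an example):

  maxmemory 2gb
  maxmemory-policy volatile-ttl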