Hi, answers are in-line. We do about 500K a day.
Andrew Donkin wrote:
In particular I am interested in: - how many boxes running spamd?
Two currently, but only because we are under a bounce-back joe job. Normally one P4 3.2 GHz with 2 GB of RAM handles the load.
- how many spamd children per box (spamd --max-children)
10 with max connections set at 250
- if Bayes is SQL, is it on the same or separate server as spamd, and
Bayes, same server as spamd
are you replicating it to balance read load?
No, our temp second box is reading from the first.
- spamc timeout (spamc -t)
The default
- rbl_timeout
2 seconds
- bayes_expiry_max_db_size
Default
- bayes_journal_max_size
Default
With autolearning on, and the default bayes_journal_max_size, the
Our auto-learn has been off since an initial week of training. I now manually add spam and ham into it.
We also run a force expiry nightly.
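For reference, here is a rough sketch of how the settings above might look in a local.cf. Only the rbl_timeout value and the auto-learn setting come from the answers above. The MySQL store module is an assumption based on the .MYD/.MYI files listed below, and the DSN, username and password are made-up placeholders, not our real values.

```
# local.cf fragment: a sketch, not a copy of the production file

# RBL lookups time out after 2 seconds
rbl_timeout 2

# auto-learn switched off after the initial week of training
bayes_auto_learn 0

# Assumption: SQL-backed Bayes stored in MySQL
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL

# Hypothetical connection details, replace with your own
bayes_sql_dsn DBI:mysql:spamassassin:localhost
bayes_sql_username sa_user
bayes_sql_password changeme
```

The nightly forced expiry is not a local.cf setting. One way to do it would be to run "sa-learn --force-expire" from cron.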
The bayes_token database is over 1.8 Gig at the moment. (Actually, 1.8 Gig for the data, and 1.3 Gig for the index)
207744000 Nov 23 15:47 bayes_seen.MYI
163404544 Nov 23 15:47 bayes_seen.MYD
 22894592 Nov 24 01:00 bayes_token.MYI
       36 Jan 12 13:00 bayes_expire.MYD
     2048 Jan 12 13:00 bayes_expire.MYI
       68 Jan 12 17:08 bayes_vars.MYD
 31961270 Jan 12 17:08 bayes_token.MYD

This is for approx. 35k users, all in a global bayes.

Regards,
Rick