Jason Frisvold <xenophage0 at gmail> writes: > I've been investigating some recent slowness issues with our mail > servers and I noticed that the spamassassin database is getting rather > large. We process approximately 300,000 mails a day (or more). The > bayes_token database is over 1.8 Gig at the moment. (Actually, 1.8 Gig > for the data, and 1.3 Gig for the index)
If you have fewer than 65,000 accounts you could halve the size of the id columns. Those folks *not* storing per-user Bayes statistics -- i.e. with only one line in bayes_vars -- could take the id column down to one byte, a TINYINT. ALTER TABLE bayes_token MODIFY id SMALLINT UNSIGNED NOT NULL; ALTER TABLE bayes_expire MODIFY id SMALLINT UNSIGNED NOT NULL; ALTER TABLE bayes_seen MODIFY id SMALLINT UNSIGNED NOT NULL; ALTER TABLE bayes_vars MODIFY id SMALLINT UNSIGNED NOT NULL; The last three won't recover much space, but table indexes should be kept the same type and size. Also, try the following. Since you are keeping Bayes per user, your spam_count and ham_count probably don't get very high so you can take those down to SMALLINTs. SELECT MAX(ham_count), MAX(spam_count) FROM bayes_token; ALTER TABLE bayes_token MODIFY spam_count SMALLINT UNSIGNED NOT NULL, MODIFY ham_count SMALLINT UNSIGNED NOT NULL; That took 25% off my bayes_token data file and index. Since MySQL likes to keep indexes in VM, I think it was worthwhile. -- Ard