Jorn Argelo wrote:
I can tell from experience that MyISAM is useless when it comes to Bayes. As Benny Pedersen pointed out, MySQL will do nothing more than wait on table locks. A single UPDATE query will take 30-90 seconds, and even longer on a busy site, not to mention that the load on your MySQL server goes sky-high. Your scan times will increase dramatically, to over 2-3 minutes for a single e-mail. If you, like me, have 25,000 e-mails a day to process, you can't afford this.

Just goes to show what you can do when you throw hardware at the problem. ;)

Our SA database bits are on one of four two-socket, quad-core Opteron machines each with 8G of RAM. The directory for the SA database is a 1.9G ramdisk; the database is dumped to a real disk daily. Table type *is* MyISAM; I've just confirmed this with mysqldump -d.

The other three machines all run SpamAssassin itself, although, except for occasional peaks, any one of them could probably handle the load OK.
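
For anyone who wants to repeat the mysqldump -d check on their own tables without reading the whole dump by eye, here is a rough Python sketch. It assumes the schema-only dump ends each CREATE TABLE with a line like ") ENGINE=MyISAM ..." (or "TYPE=..." on older servers). The function name table_engines and the sample dump text are made up for illustration, and the column definitions in the sample are not the real Bayes schema.

```python
import re

# Matches the opening line of a table definition, with or without backticks.
CREATE_RE = re.compile(r"CREATE TABLE `?(\w+)`?")
# Matches the closing line of a table definition, e.g. ") ENGINE=MyISAM ...".
# Older MySQL versions wrote TYPE= instead of ENGINE=.
ENGINE_RE = re.compile(r"\)\s*(?:ENGINE|TYPE)=(\w+)")


def table_engines(schema_dump: str) -> dict:
    """Map each table name in a schema-only dump to its storage engine."""
    engines = {}
    current = None
    for raw_line in schema_dump.splitlines():
        line = raw_line.strip()
        created = CREATE_RE.match(line)
        if created:
            current = created.group(1)
            continue
        closed = ENGINE_RE.match(line)
        if closed and current:
            engines[current] = closed.group(1)
            current = None
    return engines


# Illustrative dump text only; feed in real mysqldump -d output instead.
SAMPLE_DUMP = """
CREATE TABLE `bayes_token` (
  `id` int(11) NOT NULL default '0',
  `token` char(5) NOT NULL default '',
  PRIMARY KEY (`id`,`token`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
"""

if __name__ == "__main__":
    print(table_engines(SAMPLE_DUMP))
```

Save the output of mysqldump -d to a file, read it in, and pass the text to table_engines; any table still reported as MyISAM is one that the locking complaint above would apply to.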

Using InnoDB is absolutely vital; in my experience you cannot use MyISAM at all. My bayes_token table has 12 million rows and is growing every day, and performance is still just fine.

What do you have for bayes_expiry_max_db_size? Is this a sitewide Bayes setup? I've got it set to 2M for sitewide use; so far that seems to be close to optimal between daily churn and accuracy.

-kgd
