On Wed, Feb 09, 2005 at 11:12:23AM +0100, Arvinn Løkkebakken wrote:
> Has anyone measured the difference in performance?
> 

Benchmark results are available for a single server with MySQL running
on localhost.  Note that the DBM tests were done with a local
database with lock_method flock, and not over NFS.

http://wiki.apache.org/spamassassin/BayesBenchmark

I've done multi-system benchmarks, but not in a controlled enough
environment to be able to publish the results.  Let's just say that
they are good, with the usual expected network latency overhead added
on.

If someone wanted to offer up a testbed of multiple machines I'd be
willing to spend the time testing various configurations.

> 
> I guess it's obvious that the setup will perform better with bayes in 
> SQL when several spamd servers are in use, but what if it's just 1 spamd 
> server?

The short story is that with SQL, learning is slower than with DBM and
scanning is faster.  Plus you lose, or at least push into the DB layer,
a lot of the lock contention issues you get with DBM.
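
For reference, a rough sketch of the two setups being compared, written
as local.cf fragments.  The option names are the standard SpamAssassin
3.x ones; the database name, user name and password are placeholders I
made up for illustration, and the store module name can differ between
releases, so check the docs for your version.

```
# DBM storage (the default) on a local disk, locking with flock
lock_method         flock

# SQL storage instead of DBM (placeholder DSN and credentials)
bayes_store_module  Mail::SpamAssassin::BayesStore::SQL
bayes_sql_dsn       DBI:mysql:spamassassin:localhost
bayes_sql_username  sa_user
bayes_sql_password  sa_password
```

Only one of the two storage setups would be active at a time; they are
shown together just to make the comparison concrete.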

> How does the size of the database matter, e.g. will SQL perform better 
> when the database is bigger?

This will largely depend on your DB tuning.  I've heard of
multi-gig and multi-hundred-gig bayes DBs.  IMO (of course I'm a
little biased) the DB format is very efficient (fixed length rows and
all that) and very fast.  I've personally never run with a DB larger
than a couple hundred megs.

Will MySQL handle a large DB better than DB_File?  I have no hard
evidence to back it up, but it should, right?

Michael
