On Wed, 3 Oct 2007, Rob Mangiafico wrote:

On Tue, 2 Oct 2007, Michał Jęczalik wrote:
There are many. It allows you to share data between user accounts (IMHO it
doesn't make much sense to have separate bayes databases for each account,
at least if they are of a 'massive' sort and users are not allowed to feed
their own spam/ham etc. - because they share mostly the same data and the
bayes is more up-to-date if one single database autolearns from many
mailboxes). It allows you to share data among several hosts. It allows
you to keep data on a remote host if you don't have enough space. Etc.

Picking up on the point of one Bayes DB in MySQL vs. individual ones for
each user, is it more effective in an ISP/host environment where you have
diverse users to have them all share one Bayes DB with autolearn, or is it
better if they each have their own Bayes data in MySQL (per user)?

We're slowly converting to mysql for bayes, and have not decided yet which
method would be best for our users and for the servers in general. Thanks.

Sorry for the late answer. Of course a single shared db is more effective; that was my main reason for doing it. You then have one bayes db and one autoexpire, and you need space for only one db. If anything goes wrong (a disk failure or a db malfunction), you have to recreate only one db.

If you don't have any significant reason to keep per-user bayes databases, you should probably use the one-for-all method.

And one more advantage: I'm not too much into SQL performance stuff, but one-for-all is probably faster, because the SQL engine doesn't have to look up multiple (possibly thousands of) different bayes databases, and it can probably cache at least some of the bayes tokens. Remember that on a large system it's common for the same spam message to arrive in multiple mailboxes at the same time.
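
For reference, a one-for-all setup could look roughly like the local.cf fragment below. This is only a sketch, not a tested configuration: the database name (sa_bayes), host (db.example.com), credentials, and the shared user name (shared_bayes) are placeholders I made up, so substitute your own.

```
# Sketch only: database name, host, credentials and the shared
# user name below are hypothetical placeholders.
bayes_store_module           Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn                DBI:mysql:sa_bayes:db.example.com
bayes_sql_username           sa_user
bayes_sql_password           changeme

# Store every account's tokens under one Bayes user, so that all
# mailboxes read from and autolearn into the same data set.
bayes_sql_override_username  shared_bayes

bayes_auto_learn             1
bayes_auto_expire            1
```

If you leave out bayes_sql_override_username, the tokens should instead be stored under whichever user SpamAssassin is scanning for, which gives you the per-user layout.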
--
Michał Jęczalik, +48.603.64.62.97
INFONAUTIC, +48.33.487.69.04
