Don't overrate Bayes. Don't focus solely on a bulletproof, highly available, clustered or replicated database. If the Bayes database is gone, only one check is gone! All the others are still there.

For my mail content, the real filtering power today comes from network checks such as URL blocklists, content checksums (Razor/DCC), and open-relay blocklists. Focus on making these additional tests work.
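
As a rough sketch of what "making them work" means, assuming a stock SpamAssassin 3.x layout where plugins are loaded from the *.pre files and options live in local.cf (the file names are the usual defaults and may differ on your system; the Razor and DCC clients must be installed separately):

```
# init.pre / v310.pre: make sure the network-test plugins are loaded
loadplugin Mail::SpamAssassin::Plugin::URIDNSBL
loadplugin Mail::SpamAssassin::Plugin::Razor2
loadplugin Mail::SpamAssassin::Plugin::DCC

# local.cf: do not skip DNS blocklist lookups, enable the checksum tests
skip_rbl_checks 0
use_razor2      1
use_dcc         1
```

Afterwards, "spamassassin --lint" should run without complaints, and feeding a known spam sample through "spamassassin -D" shows in the debug output whether the network tests are actually being queried.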

For Bayes, use a central SQL database on one server, shared by all your MTAs, and keep it simple. Prepare a disaster recovery plan for the database machine and for rebuilding an empty SA Bayes database. The rebuild can be very fast. Don't back up the Bayes token data. You wrote that you expect 500,000 messages per day. If you use Bayes auto-learning, an empty central Bayes database is refilled to a usable state from current messages in only a few hours. This is probably faster than a cumbersome restore process.
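
A minimal sketch of such a setup in local.cf on each MTA, assuming MySQL as the backend. The host name, database name, credentials and the shared user name below are placeholders I made up, not values from your environment:

```
# local.cf (identical on every MTA)
use_bayes                   1
bayes_store_module          Mail::SpamAssassin::BayesStore::MySQL
# DSN format: DBI:driver:database:hostname[:port]  (placeholder values)
bayes_sql_dsn               DBI:mysql:sa_bayes:db.example.com
bayes_sql_username          sa_user
bayes_sql_password          changeme
# let all scanning processes share one token set (placeholder name)
bayes_sql_override_username sitewide
# refill the database from live traffic
bayes_auto_learn            1
```

With the default settings, Bayes stays inactive until it has learned at least 200 spam and 200 ham messages (bayes_min_spam_num and bayes_min_ham_num), so an empty database costs you nothing except the missing Bayes score while it refills. At roughly 20,000 messages per hour, that minimum should be reached quickly even though auto-learning only picks up messages with clearly high or low scores.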

regards,
Alex
