Greetings,

After struggling a bit with Bayes in general and trying to figure out a way to make things run a bit faster, I've done some serious digging and I want to clarify a few things before I make a mess of my Bayes DB...

I have everything currently set up to use a MySQL database. The bayes_token table is about 3GB in size and tends to be the slowest link in the system. (AWL isn't too far behind, but I think I have a viable strategy for dealing with that monster.)

First, some quick assumptions. Please correct me if I'm wrong.

All of the bayes_ tables are directly related via the id field. bayes_token contains the actual tokens for Bayesian processing, and bayes_seen contains the message IDs of messages Bayes has already processed for tokens, presumably to reduce CPU usage? I *think* bayes_vars merely contains the magic data used by Bayes, and I have no idea what bayes_expire is for. Am I correct thus far?

Now, given that, I can directly map my users to an entry in bayes_vars and identify their "id". With that, I can purge nonexistent users from the system. Simple enough.

Now, for other users, can I trust the last_expire field in bayes_vars and formulate something to force-expire at periodic intervals based on that value? I realize that spamc/spamd already expire when necessary, but I think I'd rather run this on a nightly basis during off-peak hours, and serialize it so that only a single user is being expired at a time. Is that a reasonable move to reduce overall CPU usage on the system?

Thanks!

-- 
Jason 'XenoPhage' Frisvold
[EMAIL PROTECTED]
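
P.S. To make the plan concrete, here is a rough Python sketch of the nightly job I have in mind. It is only an illustration under my assumptions above, not something I have run against the live database: it assumes bayes_vars rows can be read as (id, username, last_expire) with last_expire as a Unix timestamp, that every bayes_ table is keyed by the same id, and that "sa-learn --force-expire -u <user>" is the right way to expire a single SQL user. The function names and the one-day threshold are made up for the example.

```python
# Hypothetical helpers for a nightly, serialized Bayes cleanup.
# Table names and the shared "id" key are assumptions from my reading
# of the schema; please correct me if they are wrong.

# bayes_vars goes last so the per-user rows are removed before the owner row.
BAYES_TABLES = ("bayes_token", "bayes_seen", "bayes_expire", "bayes_vars")


def find_orphans(vars_rows, valid_users):
    """Return the bayes_vars ids whose username is no longer a real user."""
    return sorted(
        row_id for row_id, username, _ in vars_rows
        if username not in valid_users
    )


def purge_statements(orphan_id):
    """Build parameterized DELETEs (MySQL %s style) for one orphaned id."""
    return [
        (f"DELETE FROM {table} WHERE id = %s", (orphan_id,))
        for table in BAYES_TABLES
    ]


def expiry_queue(vars_rows, valid_users, now, min_age=86400):
    """List valid users whose last expire is at least min_age seconds old,
    stalest first, so they can be expired one at a time."""
    due = [
        (last_expire, username)
        for _, username, last_expire in vars_rows
        if username in valid_users and now - last_expire >= min_age
    ]
    return [username for _, username in sorted(due)]


def expire_command(username):
    """Argument list for force-expiring a single user's Bayes data."""
    return ["sa-learn", "--force-expire", "-u", username]
```

The cron job would then walk expiry_queue() in order and run each expire_command() with subprocess.run(), waiting for one to finish before starting the next, so only a single user is being expired at any moment.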