Hello Peter,

Thursday, April 7, 2005, 5:29:38 AM, you wrote:

PM> I have been building a new mailserver to replace my old one.
PM> The new one has postfix, Cyrus-imap, anomy, spamassassin. I am trying
PM> to set up the Bayes auto-learn stuff. Each user has a home directory on
PM> the server (they can not log onto the server). I am using the Maildir
PM> format.

PM> Is it better to have a cron job run by a single user (say root) to do
PM> the ham / spam learning for everyone, or should I run a cron for each
PM> individual user. All users belong to the same company.

Best, if you have the disk space for the multitude of Bayes databases,
is to run ham/spam learning as each user.

I'd recommend the "running constantly if I staggered it for every
user" approach, something like:

- run as cron
- get cycle start time
- identify list of active users
- for each active user
  - determine if anything to learn; skip to next user if not
  - su to that user's id
  - sa-learn
- if not yet 30 min since start of this cycle, sleep 15 min
- loop to next cycle.

PM> Problem I have thought of with the latter.
PM> 1. There would be approximately 130 cron jobs running sa-learn at the
PM> same time .... or it would run constantly if I staggered it for every
PM> user. What kind of load will that have on my 850 with 756 MB of ram ?

Running constantly, staggered, will work better on that system (IMO)
than allowing multiple executions at the same time.

PM> Problems I have with both:
PM> 1. What is the best method of obtaining the spam / ham. I have the
PM> server create a spam folder for each user when the user is created.
PM> spamassassin will automatically put all mail marked as spam in this
PM> folder. Obviously I will use this folder to run sa-learn on for spam.

NO. NO. NO. NO. Do not run sa-learn on automatically flagged emails. SA
does this itself somewhat conservatively (though not conservatively
enough -- I suggest lowering the ham auto-learn threshold).

Provide instead a "missed-spam" folder and a "not-spam" folder.
Have your people copy/move miscategorized emails into those, and learn
from those folders.

PM> 2. How often should I run sa-learn ? Users here for the most part get
PM> mail in their inbox and then after reading it move it to some other sub
PM> folder ... (of which everyone's is different, and some have over 100).

On single-domain systems I normally run it hourly.

PM> Are there any downfalls to running a site wide one ? What is the best
PM> method of doing this if this is a better method. Currently I plan to
PM> use this to learn the spam. Does anyone see any problems.
PM> (Note: this assumes it is being run as a particular user.)

Some people prefer system-wide, others domain-wide, others
user-specific. YMMV. Feasibility might be the more important criterion,
since all three can work.

Bob Menschel
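
A rough sketch of the staggered cycle described above, in Python, in case
it helps to see it spelled out. Several things here are assumptions and
not part of the original setup: the folder names are stored Maildir++
style as <home>/Maildir/.missed-spam and <home>/Maildir/.not-spam, the
script runs as root, "su -s /bin/sh" is available (needed when users have
no login shell), and list_users is a hypothetical callable you supply
that returns (user, home) pairs. Only sa-learn's --spam and --ham options
are used.

```python
import shlex
import subprocess
import time
from pathlib import Path

# Assumed folder names (Maildir++ layout) mapped to the sa-learn option.
FOLDERS = {".missed-spam": "--spam", ".not-spam": "--ham"}
MIN_CYCLE_SECONDS = 30 * 60  # "if not yet 30 min since start of this cycle"
SLEEP_SECONDS = 15 * 60      # "sleep 15 min"


def learn_commands(user, home):
    """Return one su command per correction folder that holds mail."""
    commands = []
    for folder, flag in FOLDERS.items():
        base = Path(home) / "Maildir" / folder
        # A Maildir keeps read mail in cur/ and unread mail in new/.
        subdirs = [
            str(sub)
            for sub in (base / "cur", base / "new")
            if sub.is_dir() and any(sub.iterdir())
        ]
        if not subdirs:
            continue  # nothing to learn here
        learn = shlex.join(["sa-learn", flag, *subdirs])
        # su to that user's id so the user's own Bayes database is used.
        commands.append(["su", "-s", "/bin/sh", "-c", learn, user])
    return commands


def run_cycle(users, run=subprocess.run):
    """Learn for each (user, home) pair in turn, never in parallel."""
    executed = []
    for user, home in users:
        for command in learn_commands(user, home):
            run(command, check=False)
            executed.append(command)
    return executed


def main_loop(list_users, now=time.monotonic, sleep=time.sleep):
    """Repeat cycles forever, pausing after cycles that finish quickly."""
    while True:
        started = now()
        run_cycle(list_users())
        if now() - started < MIN_CYCLE_SECONDS:
            sleep(SLEEP_SECONDS)
```

To use it you would call main_loop(list_users) once from a job started
by cron or at boot. Running one user at a time keeps only a single
sa-learn process active, which is the point of staggering on a small
machine.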
