On Sat, Oct 19, 2002 at 03:01:39PM -0500, Bryant, Eric D. wrote:
> I am working on implementing a spam-filtering solution for Purdue
> University and SpamAssassin is one of the products at the top of my
> list. I'm wondering if you guys can give me some feedback as to what
> your experiences have been thus far with SA. Here are some of the
> questions I have:

Bearing in mind that we're still in "learning mode" here...

> 1. Can SA work well as an opt-in/opt-out solution?

Depends. If you implement it at the procmail level, sure. That's not
the way we're doing it; we have a mix of Unix, Exchange, and who knows
what all else servers, and users using Netscape, Outbreak, Pine, Mutt,
Sun Mailtool, etc. Mix of clueful and clue-impaired users.

You definitely want a spamc/spamd setup, or your system is going to
spend a lot of time loading perl and re-compiling the same regular
expressions over and over.

> 2. What kind of false positive % should I expect?

Regular email, few. I don't have an actual percent. The problematic
false positives have been industry newsletters that are very
spammy-looking, sent to marketroids who go nonlinear when their Most
Wonderfully Valuable Newsletter gets defiled with a "spam" tag.

CNN news gets tagged a lot, and a "Raves" mailing list gets the porn
tags a lot because of how those people talk. :) But those aren't
business-related emails, so we don't much care about them. FedEx
package tracing got tagged until we whitelisted FedEx.

We have users in the Far East, and Chinese email gets dinged a lot
because the Asian character set encoding looks like lots of all caps
and exclamation points. I ended up writing my own rule to subtract
about nine points from the score if the charset was one of the Asian
ones. (Sub-optimal with all the Chinese spam. It'd be nicer if the
PLING and YELLING rules didn't operate on non-Roman character sets.)

> 3. Maintainability: Does SA require a lot of maintenance
> on a day to day basis?

A lot of tweaking at first. It's pretty trouble-free now. I run a
cron job to shut down and restart spamass-milter once an hour, because
every once in a while (less than once a week) it just goes into a
blank stare and passes everything.

> 4. How well does it perform at large sites?
> (We process around 5-700,000 emails a day)

We're running spamassassin on two dedicated Sun E250s (single-CPU
400MHz), handling on the order of 20,000 good emails and 12,000 spams
per day per server. The servers don't seem overloaded.

> 5. What MTA do you recommend?

We use sendmail/spamass-milter. I'm not really satisfied with
spamass-milter; I'd really like a milter that talks directly to
spamd. (And one which is more robust than spamass-milter.)

> The design I'm looking at is a gateway solution that our users
> can opt-in to. For the ones who opt-in, we'll create a
> separate junk-mail folder for them that their quarantined mail
> will be sent to instead of their usual inbox. Has anyone here
> implemented a similar design to this?

Our design is to scan all incoming email at the gateway, before it is
distributed to the many scattered mail servers. With this design
there isn't a good way to allow users to opt in or out - they do not
have Unix logins on the filter servers, and many of them don't know
what a Unix login is.

For those who complained that they wanted off, we added a
"whitelist_to" entry, but this doesn't work for mailing lists:
whitelist_to whitelists only those who are in the "To" or "Cc"
headers; it doesn't look at the envelope addressing.

--
Mike Van Pelt
email: [EMAIL PROTECTED]
phone: 408-433-4282
Pager: 800-533-4559 or email [EMAIL PROTECTED]
       or web www.skytel.com, pin 5334559

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk