On 7/30/2010 10:58 AM, Adam Moskowitz wrote:
> Background: SpamAssassin version 3.2.5 running on Perl version 5.8.8 on
> CentOS release 5.2 (Final) -- all set up for me by my sysadmin. Everything
> works fine when using all the defaults. However . . .
>
> I want to use spamassassin's per-user whitelisting as part of some mail
> processing I'm doing. I'm dealing with a lot of messages (potentially
> over 100,000), but doing it one-at-a-time (and I can't easily change
> that). spamassassin takes a long time to load and run (1.5 - 2 seconds
> per message), and it's performing over 50 tests per message even though
> for this purpose I need only 1 or 2 of those tests.
>
> Can I arrange to load/run only the tests I need? If so, how?
>
> I've read what I believe are the relevant docs but I can't find what
> would let me do this.
>
> I can't (and don't want to) modify the system set-up, but I can create
> private, custom versions/copies of config files, rules, rules
> directories, whatever; I'm even willing to accept that I may have to
> manually apply updates to these private files when the system updates
> spamassassin. However, I can't figure out what in these private config
> files would be used to say "here's my (pared-down) directory of rules"
> or "run only these tests" or however this problem can be solved.
>
> Can someone please help?

How are you running SA? What is sending mail to it?

If you are calling 'spamassassin' for each message, you should switch to
using spamc/spamd.

If you are using spamc/spamd, you would have to have a separate instance
of spamd for each set of rules. How many spamd processes do you
currently have? How much memory is on the machine? My guess is that it
would be counter-productive to try to run multiple rule sets.

Obviously, you can remove any rules that are not needed to speed things
up a bit.

I would suggest that you analyze your memory usage and figure out the
optimal number of child processes to run without getting into swap.
Then make sure your MTA or whatever is sending messages to SA is set up
to handle that many processes.

If you truly are limited to processing one message at a time, then you
are probably already close to your performance wall. SA runs best when
it can multitask processing multiple messages at once.

--
Bowie
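
The memory analysis suggested above comes down to simple arithmetic, which
could be sketched roughly as follows. This is only an illustration: the
function name and all of the figures (total RAM, the amount held back for
the OS and MTA, and the resident size of one spamd child) are hypothetical
and would need to be measured on the actual machine, for example with ps
or top while spamd is under load.

```python
def max_spamd_children(total_mb, reserved_mb, per_child_mb):
    """Estimate how many spamd children fit in RAM without swapping.

    total_mb     -- physical memory on the machine, in MB
    reserved_mb  -- memory held back for the OS, MTA, and other services
    per_child_mb -- measured resident size of one spamd child, in MB
    """
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    usable_mb = total_mb - reserved_mb
    if usable_mb <= 0:
        # Nothing left over for spamd at all.
        return 0
    # Whole children only; round down so the total stays under the limit.
    return usable_mb // per_child_mb


if __name__ == "__main__":
    # Hypothetical figures: 2048 MB of RAM, 512 MB reserved,
    # and roughly 60 MB resident per spamd child.
    print(max_spamd_children(2048, 512, 60))
```

The resulting number is a starting point for spamd's --max-children (-m)
setting, and the MTA or other sender should be allowed at least that many
concurrent connections. Children share some memory with the parent, so a
per-child figure taken from resident size tends to give a conservative
estimate.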