Robert Fitzpatrick wrote:
> On Wed, 2005-12-14 at 17:41 -0500, Matt Kettler wrote:
>
>>Robert Fitzpatrick wrote:
>>You can improve speed by:
>>1) disabling things, such as bayes URIBLS and RBLs
>>2) If you are using bayes switching from DB_File BayesStore to SQL (recommended)
>>or SDBM (fast but not well tested) will yield considerable gains.
>>3) Minimizing your add-on rulesets.
>>
>>I'd suggest doing a little experiment and disable DNS and Bayes and see what
>>happens to your scan times.
>>
>>/etc/mail/spamassassin/local.cf:
>>use_bayes 0
>>dns_available no
>>
>>Be sure to restart amavis to re-parse these options. Doing this will cause more
>>spam to skip by, but doing this will quickly tell you if one or the other of
>>thee features is your problem.
>>
>>If scan times improve substantially, try turning bayes on and see what happens.
>>Then turn bayes off and turn on DNS and see what happens. This will help
>>determine which feature is causing your system the extra slowdown.
>
>
> I tried dns_available no before, but that seems to have been done the
> trick by disabling bayes as well. My timings are mostly 300-500 with
> some 1000ms. Seems timing drops to these levels after disabling dns, but
> my queue doesn't start dropping until I disable both, then wham, down
> she goes...thanks.
>
> But now, what do I need to know about these features, is it my Berkeley
> DB? And DNS seems to be fine on the server.

For DNS, well, DNS lookups are slow by nature, and SA makes a lot of them. You can improve the speed a little by running a caching nameserver on the local host, but that's not a "fix-all".

You can also try lowering rbl_timeout to 10 or so, to put a shorter limit on how long SA will wait for tardy DNS servers. This comes at the expense of missing some responses that slower servers were about to deliver.

For bayes, there are some things to look into.

If you stay with DB_File (Berkeley DB) or choose to switch to SDBM:

1) If you aren't accessing the databases over NFS, change your lock_method to flock. nfssafe is the default, but it's slower than flock.
2) Turn bayes_learn_to_journal on. This will greatly reduce lock contention on the bayes DB when autolearning is going on.

As for bayes DB types, Berkeley DB is undeniably the slowest at scanning messages:

http://wiki.apache.org/spamassassin/BayesBenchmarkResults

Note that "phase 2" reflects the time in seconds to scan 2000 messages using spamc. MySQL and SDBM are nearly 3 times faster at this.

Since SQL is well-tested, that might be the better way for you to go. SDBM has some issues.

Either way, if you change DBs you'll want to do a sa-learn --backup >bayesbackup, change the bayes_store_module setting, and then do a sa-learn --restore bayesbackup.

Unfortunately sa-learn --restore doesn't work so well on SDBM, which is why I'd be reluctant to go the SDBM route unless you're ready to brave the unknown and jump through some hoops:

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4670
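
Pulled together, the settings mentioned above might look roughly like this in local.cf. Treat it as a sketch rather than a tested config: the values are starting points, and the SQL DSN, user name and password are placeholders I made up for illustration, not anything from your setup. The SQL lines are commented out because they only make sense once the SQL database exists and you've done the backup/restore.

```
# /etc/mail/spamassassin/local.cf

# Turn the features back on once you know which one hurts
use_bayes 1
dns_available yes

# Give up on slow DNS blocklist servers sooner
# (the documented default is 15 seconds)
rbl_timeout 10

# DB_File or SDBM only, and only if the bayes DB is NOT on NFS
lock_method flock

# Queue autolearn updates in the journal instead of locking the DB
bayes_learn_to_journal 1

# Alternative: move bayes to SQL (placeholder DSN and credentials)
# bayes_store_module Mail::SpamAssassin::BayesStore::SQL
# bayes_sql_dsn      DBI:mysql:spamassassin:localhost
# bayes_sql_username sa_user
# bayes_sql_password sa_password
```

As before, restart amavis after any change so the options are re-parsed.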