Re: Spam Assassin Load Balancing

Paolo Cravero Tue, 08 Jan 2008 01:58:07 -0800

Thomas Ledbetter wrote:

First of all: we're running amavisd-new, not plain spamc/spamd anymore.

We used to have N servers, each running its own spamd daemons, and so each with a separate Bayes/AWL DB.


I have not understood how many of your machines run spamc and how many run spamd.

With a round-robin policy on a hardware load balancer, once the connection is routed to a specific 'worker bee', if that machine times out, the request will fail, and the mail won't get scanned. However, more intelligent hardware load balancing setups can monitor the work on each node, and take it out of service as necessary.

A load balancer marks non-responding nodes as offline, according to different levels of checks (ICMP ping, TCP ping, service check, ...). But these checks are not real-time, so if spamd dies during analysis the connection will drop (or hang) and spamc will time out. The load balancer won't restart the connection to another node. At least not our HLB. Been there (with LDAP), done that!
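
Since the balancer will not retry for you, the retry has to live on the client side. Here is a minimal sketch of that idea in Python, not what spamc or amavisd-new actually do. The names scan_with_failover and scan, and the node names in the usage note, are hypothetical; scan stands for whatever function really talks to a spamd node and is assumed to raise OSError (which includes timeouts and dropped connections) on failure.

```python
def scan_with_failover(message, nodes, scan, timeout=30.0):
    """Try each node in order and return (node, result) from the first success.

    `scan` is a hypothetical callable scan(node, message, timeout) supplied by
    the caller; it is assumed to raise OSError (TimeoutError and
    ConnectionError are subclasses) when a node hangs or drops the connection.
    """
    errors = []
    for node in nodes:
        try:
            return node, scan(node, message, timeout)
        except OSError as exc:
            # This node died or hung mid-analysis: remember why, try the next.
            errors.append((node, exc))
    raise RuntimeError(f"all spamd nodes failed: {errors!r}")
```

Called as scan_with_failover(msg, ["spamd1", "spamd2"], scan), a timeout on spamd1 makes the whole message get resubmitted to spamd2, so the worst case costs one timeout per dead node rather than an unscanned mail.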

Also, when running a round-robin based cluster, is there any problem having a mix of machines with different performance capacities? i.e. If I have a 10 node cluster, and 3 of the servers are much slower than the others, will it impact performance of the cluster as a whole? Even if I limit the number of spamd that run to a lower value than the higher performance machines?

What do you consider as "performance"? I think the global average analysis time (what I call "performance") will obviously be affected, by an amount that depends on load distribution. With a real load balancer you can use different priorities for each node, so as to keep faster machines busier than slower ones.
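
To illustrate what per-node priorities buy you, here is a rough Python sketch of a weighted round-robin (the "smooth" variant, which interleaves picks instead of sending bursts to one node). It is only an illustration, not how any particular load balancer implements priorities; the function name weighted_schedule and the weights in the usage note are made up.

```python
def weighted_schedule(weights, count):
    """Return `count` node picks using smooth weighted round-robin.

    `weights` maps a (hypothetical) node name to a positive integer weight;
    a node with weight 3 receives three times the connections of a node
    with weight 1.
    """
    current = {node: 0 for node in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(count):
        # Every node earns credit proportional to its weight...
        for node, weight in weights.items():
            current[node] += weight
        # ...and the node with the most credit takes the next connection.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks
```

With weights {"fast": 3, "slow": 1}, every four connections send three to the fast node and one to the slow node, so the slow machines still contribute without dragging the average analysis time down as much as a plain round-robin would.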

Anyway, I've seen spamd running on different hardware since 2004 and I wouldn't say the analysis speed has improved significantly. Just don't let spamd nodes swap memory to disk.


Good luck with the high-load spam fight,
Paolo
