Michelle Konzack wrote:
> SpamAssassin works already, but what must I do if I like to use ClamAV
> over network with 4-12 scanning machines?

Hi Michelle,

a definite answer would require better knowledge of your environment, and I'm not a courier-mta user. However, here are some generic suggestions that may help you.

First of all, ClamAV is generally faster and much less resource hungry than SpamAssassin. The obvious choice is to run ClamAV first and SA next.

Second, avoid middleware-generated overhead whenever possible. For example, if your MTA can interface natively with SA and clam, then don't use amavis. If it can't, then use amavis only as glue and disable all its own checks. Of course both suggestions imply that you don't care about the rest of the amavis functionality. Since I've mentioned amavis, please also be aware that, under the most common config, it will cause each message to be basically scanned twice: each attachment separately first, then the full message (with all the attachments). If you can, let clamav scan only the full message.

Third, carefully balance latency and performance. You can control the number of scanning threads in clamd via the MaxThreads directive. Performance-wise, the optimal number of threads is somewhere between N and N*2 (with N+1 or N+2 likely being the absolute best), where N is the total number of cpu cores. Please note, however, that when all the scan threads are busy, further requests will be queued and possibly refused. You certainly want to have enough threads available so that scan requests from the mta are not refused or delayed for too long. At the same time, avoid an excessive number of threads, as this only wastes resources.

Fourth, avoid IO as much as possible. Although clamav mostly bottlenecks on the cpu, disk IO can very badly impact the performance of clamd in busy environments. Besides reading the files to be checked, clamd may internally generate quite a few temporary files. Under light load these files are very short lived and never really touch the disk, hence no time is spent on IO.
However, under heavy load, the kernel may decide to actually commit them to the disk (or to the journal) in order to free some memory. This increases iowait and negatively affects the scan performance. If you have the choice, pick a box with more ram and slower disks, and use tmpfs for the clamd tempdir and the mta (or amavis) scan spool (not the mail spool directory!).

Back to your specific issue: clamd can scan streams from the network. All you have to do is set up a tcp socket instead of (or in addition to) the unix socket. Then you need a clamd client that can properly communicate with a remote clamd. Since clamav-milter is not an option in your case, the most obvious choice is probably clamdscan via a tiny courier perlfilter script or via amavisd.

Finally, if you have more clamd's than mta's, then you may want to fairly distribute (load balance and fail over) scan requests across all the available scanners. Again you have several options here, ranging from writing a piece of perl filter to manage the scan requests, to routing mails to a second line of mta's (or amavisd's) in a (possibly dns based) round robin fashion.

HtH,
--acab
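
P.S. To make the remote scanning and fail-over idea above a bit more concrete, here is a minimal sketch of a client that talks to clamd over its tcp socket using the INSTREAM command and falls back to the next scanner when one is unreachable. It is written in Python rather than perl purely for illustration. The function names are my own invention, the default port 3310 is only the conventional one (use whatever your TCPSocket directive says), and how you hook this into courier is left open. Treat it as a starting point under those assumptions, not as a tested filter.

```python
import socket
import struct

CLAMD_PORT = 3310  # assumption: conventional clamd port; match your TCPSocket setting


def frame_instream(data: bytes, chunk_size: int = 8192) -> bytes:
    """Encode `data` as a clamd INSTREAM request.

    Each chunk is prefixed with its length as a 4-byte big-endian integer,
    and a zero-length chunk marks the end of the stream.
    """
    parts = [b"zINSTREAM\x00"]  # "z" prefix: command and reply are NUL-terminated
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        parts.append(struct.pack("!I", len(chunk)) + chunk)
    parts.append(struct.pack("!I", 0))
    return b"".join(parts)


def parse_reply(reply: bytes):
    """Return (is_clean, signature_name) for a clamd reply, or raise on errors."""
    text = reply.rstrip(b"\x00\n").decode("utf-8", "replace")
    if text.endswith("OK"):
        return True, None
    if text.endswith("FOUND"):
        # Reply looks like "stream: <signature> FOUND"
        name = text.partition(":")[2].strip()[:-len("FOUND")].strip()
        return False, name
    raise RuntimeError("unexpected clamd reply: %r" % text)


def scan_once(data: bytes, host: str, port: int = CLAMD_PORT, timeout: float = 30.0):
    """Send one message to one clamd and return the parsed verdict."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(frame_instream(data))
        reply = b""
        while not reply.endswith(b"\x00"):
            part = sock.recv(4096)
            if not part:  # connection closed by clamd
                break
            reply += part
    return parse_reply(reply)


def scan_with_failover(data: bytes, hosts, start: int = 0, scan=scan_once):
    """Try every scanner once, beginning at index `start`.

    Incrementing `start` for each message gives a simple round robin;
    a scanner that is down or answers with an error is skipped.
    """
    if not hosts:
        raise ValueError("no scanners configured")
    offset = start % len(hosts)
    last_error = None
    for host in hosts[offset:] + hosts[:offset]:
        try:
            return host, scan(data, host)
        except (OSError, RuntimeError) as exc:
            last_error = exc
    raise RuntimeError("no clamd scanner could handle the request") from last_error
```

You would call it with something like scan_with_failover(message_bytes, ["scan1.example.net", "scan2.example.net"], start=counter), where the host names are placeholders and counter is any per-message number you keep. Keep in mind that clamd limits the size of a stream via StreamMaxLength, so very large messages may come back as an error instead of a verdict.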