> Hello, thanks for the post. Firstly, you are wrong about performance of my
> computer - I dont have supercomputer. I didnt run 10 000 000 messages
> through spamc/spamd. In fact the number is 100 000 000 and it means the max.
> size of message I run through spamc/spamd(notice that the number is behind
> -s parametr, s as SIZE). The result about 85 minutes is for about 17000
> messages (354MB). The average is 3,33 sec per message.

That number seems pretty high. I'm not experienced enough in the general deployment of SA to say anything definite, so I can only contribute numbers and hints from our own system.

We use amavisd-new, which doesn't spawn SA but has it running all the time, thus saving lots of time in that area. Amavisd/postfix/SA can be configured to offer a lot of parallelism and can thus take full advantage of available system resources. Currently we have 32 parallel processes running on a rather small machine (2 cores, 3 GB RAM), and our average per message is around 1.5 seconds.

If you need to improve performance, I suggest you start by looking at the machine. Do you have a lot of iowait? Get faster disks, or look at dividing access between multiple drives. Is the machine swapping? Add more memory. Do you have constantly high CPU usage? Add more CPUs.

Then start looking at the timing reports (I don't know if these are provided by SA or amavisd, so you might not have them in your setup). Every mail that goes through the system gets a timing report logged, so you can see exactly how much time each step of the process took. It looks like this:

Aug 5 00:01:53 post amavis[30559]: (30559-07) TIMING-SA total 1438 ms - parse: 1.60 (0.1%), extract_message_metadata: 35 (2.5%), get_uri_detail_list: 4 (0.3%), tests_pri_-1000: 13 (0.9%), tests_pri_-950: 1.54 (0.1%), tests_pri_-900: 1.55 (0.1%), tests_pri_-400: 33 (2.3%), check_bayes: 31 (2.2%), tests_pri_0: 1280 (89.0%), check_dkim_adsp: 109 (7.6%), check_spf: 40 (2.8%), poll_dns_idle: 35 (2.4%), check_dcc: 525 (36.5%), check_razor2: 492 (34.2%), check_pyzor: 0.25 (0.0%), tests_pri_500: 28 (1.9%), learn: 23 (1.6%), get_report: 1.45 (0.1%)

Here you can see that check_dcc and check_razor2 are pretty expensive, because they have to query external servers. We are a low-traffic site (fewer than 50k messages a day), so that's not a problem for us.

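If you want to pick out the expensive steps without reading each log line by eye, something like the following might help. It is only a rough sketch in Python: it assumes the log line has exactly the format of the sample above, and the function name parse_timing_sa and the regular expressions are my own, not part of SA or amavisd.

```python
import re

# Sample line copied from the log excerpt above.
SAMPLE = (
    "Aug 5 00:01:53 post amavis[30559]: (30559-07) TIMING-SA total 1438 ms - "
    "parse: 1.60 (0.1%), extract_message_metadata: 35 (2.5%), "
    "get_uri_detail_list: 4 (0.3%), tests_pri_-1000: 13 (0.9%), "
    "tests_pri_-950: 1.54 (0.1%), tests_pri_-900: 1.55 (0.1%), "
    "tests_pri_-400: 33 (2.3%), check_bayes: 31 (2.2%), "
    "tests_pri_0: 1280 (89.0%), check_dkim_adsp: 109 (7.6%), "
    "check_spf: 40 (2.8%), poll_dns_idle: 35 (2.4%), check_dcc: 525 (36.5%), "
    "check_razor2: 492 (34.2%), check_pyzor: 0.25 (0.0%), "
    "tests_pri_500: 28 (1.9%), learn: 23 (1.6%), get_report: 1.45 (0.1%)"
)


def parse_timing_sa(line):
    """Return (total_ms, {step: ms}) for a TIMING-SA line, or None if no match.

    Hypothetical helper: the pattern is inferred from the sample line only.
    """
    m = re.search(r"TIMING-SA total (\d+(?:\.\d+)?) ms - (.*)", line)
    if m is None:
        return None
    total_ms = float(m.group(1))
    # Each entry looks like "name: <ms> (<percent>%)"; names may contain '-'.
    steps = {
        name: float(ms)
        for name, ms in re.findall(r"([\w-]+): ([\d.]+) \(", m.group(2))
    }
    return total_ms, steps


total, steps = parse_timing_sa(SAMPLE)
# Sort steps by time spent, most expensive first, and keep the top three.
top = sorted(steps.items(), key=lambda kv: kv[1], reverse=True)[:3]
for name, ms in top:
    print(f"{name:20s} {ms:8.2f} ms  ({100 * ms / total:.1f}% of total)")
```

Run over a whole log file, the same function could be used to average each step across many messages. Note that the percentages in the sample add up to more than 100%, so some entries seem to overlap with others; that is my reading of the numbers, not something I have checked in the documentation.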
If you do have a high volume of traffic, though, and the DNS-lookup-dependent tests take a long time, you might consider adding a local DNS server to your setup. See http://www.spamtips.org/2011/07/spamassassin-why-run-your-own-dns.html for further information.

-- 
Lars