On Sun, 2009-02-15 at 02:05 +0100, Karsten Bräckelmann wrote:
> Lindsay, if you end up doing some benchmarking, please let us know. I
> wouldn't be surprised if you're actually the first one to do this across
> the Internet. :)
> 
Just a thought. Since getting message sizes and counts on traffic
between a client and server isn't the easiest thing to do unless they're
already instrumented to collect this information, the best approach may
be two-pronged:

1) Write a Perl or awk script that processes /var/log/maillog.* and
gathers message size statistics. The regex 'spamd.*bytes.$' will pick
out the relevant log lines, and the message size is the second-to-last
field. The script would count messages in size bands, e.g. 0-10KB,
10-100KB, 100KB-1MB, 1MB-250MB, >250MB, to get some size and frequency
statistics.

2) Pick a message from each band and run it through spamc manually while
using Wireshark to capture both spamc-spamd traffic and spamd-MySQL
traffic. Combining the message sizes and counts from the two streams
should give you enough information to correctly size the traffic flows. 
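
For step 1, here is a minimal sketch of the counting logic, written in
Python rather than Perl or awk. It assumes what is stated above: lines
matching 'spamd.*bytes.$' carry the size as the second-to-last
whitespace-separated field. The function names and the exact band
boundaries (1KB = 1024 bytes) are my own choices for illustration, and
it assumes the rotated logs are plain text, not compressed.

```python
import re
from collections import Counter

KB = 1024
MB = 1024 * KB

# (exclusive upper bound in bytes, label) for each size band
BANDS = [
    (10 * KB, "0-10KB"),
    (100 * KB, "10-100KB"),
    (1 * MB, "100KB-1MB"),
    (250 * MB, "1MB-250MB"),
]
OVERFLOW = ">250MB"

# Same pattern as suggested above for picking the relevant log lines
LINE_RE = re.compile(r"spamd.*bytes.$")


def band_for(size):
    """Return the label of the band that a size in bytes falls into."""
    for limit, label in BANDS:
        if size < limit:
            return label
    return OVERFLOW


def count_bands(lines):
    """Count matching log lines per size band."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not LINE_RE.search(line):
            continue
        fields = line.split()
        try:
            size = int(fields[-2])  # second-to-last field is the size
        except (IndexError, ValueError):
            continue  # skip lines that do not have the expected shape
        counts[band_for(size)] += 1
    return counts


def report(paths):
    """Print one line per band for the given log files."""
    counts = Counter()
    for path in paths:
        with open(path, errors="replace") as fh:
            counts.update(count_bands(fh))
    for label in [label for _, label in BANDS] + [OVERFLOW]:
        print(f"{label:>10}  {counts[label]}")
```

Calling report(glob.glob("/var/log/maillog.*")) after "import glob"
would print the per-band counts, which can then be multiplied by the
per-message traffic measured in step 2.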

====
Question to developers on this list: Why is a message that exceeds the
maximum size skipped entirely? Is there a case for passing its headers
through spamd and then combining the returned headers with the body in
spamc? It would give a bit more protection and doesn't look too
difficult to do since spamd is already capable of handling just the
headers.
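
To make the idea concrete, here is a rough Python sketch of the
split-and-recombine step. It is not how spamc works today:
scan_headers is a hypothetical stand-in for whatever call would send
the header block to spamd and return the rewritten headers, and the
function names are made up for illustration.

```python
def split_message(raw):
    """Split raw message bytes at the first blank line.

    Returns (headers, separator, body). The header block is returned
    without its final line ending; that ending is part of the separator.
    """
    candidates = [(raw.find(sep), sep) for sep in (b"\r\n\r\n", b"\n\n")]
    found = [(pos, sep) for pos, sep in candidates if pos != -1]
    if not found:
        return raw, b"", b""  # no blank line: treat everything as headers
    pos, sep = min(found)  # earliest blank line wins
    return raw[:pos], sep, raw[pos + len(sep):]


def scan_oversized(raw, scan_headers):
    """Scan only the headers and reattach the untouched body."""
    headers, sep, body = split_message(raw)
    new_headers = scan_headers(headers)  # hypothetical header-only scan
    return new_headers + sep + body
```

The body never leaves the client side here, so the traffic for an
oversized message would be limited to its headers.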
 

Martin

