Peter Mikeska (MiKi) wrote:
here are my 2 cents ;)
I'm not Joe, but I have to disagree with some of your points. <g>
No word about the MTA; from the other answers it looks like sendmail.
For high volume and this kind of thing there is something fast and
relatively easy ;)
Umm... no. Switching MTAs on a major mail cluster is NOT trivial or easy.
qmail also requires a fair amount of patching to behave "properly" AKA
"according to generally accepted standards of behaviour". This may
admittedly be *easier* in a larger environment because you can build
once and copy that to many systems.
- use qmail as the gateway for outbound mail (I don't know how the
webmail is using SMTP, but it can be set up to use the qmail IP)
- on qmail use simscan with SA - it's in C and thus fast
Your MTA and MTA-content-scanner-glue are trivial loads compared to SA
itself. (IIRC Joe noted that he would probably run spamd on remote hosts
instead of SA integrated with the scanner (e.g. MIMEDefang), so that
alters the structure a fair bit as well.)
- in simscan you can easily set the quarantine to a directory, and the
best thing is that the message in quarantine is untouched, a clean
message exactly as it came in over SMTP.
The same applies to any other properly-written MTA content scanner. As
noted by several other people, header tests won't be nearly as much use
in determining spamminess as they might be on the recipient side of things.
Joe, one thing I'd suggest is to try to capture "legitimate" vs "spam"
mail for a short time, and regenerate the entire set of SA scores from
that data. IMO it'll likely make your system much more accurate because
the message body tests will end up being weighted much higher due to the
limited usefulness of most header tests.
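
To make the idea concrete, here is a toy sketch of what "regenerating
scores from captured mail" means. It is NOT SpamAssassin's own rescoring
tooling, just a perceptron-style loop under assumed inputs: each captured
message is reduced to the set of rules that hit plus a hand-assigned
spam/ham label. The rule names, the sample data, and the train_scores /
classify helpers are all hypothetical; only the 5.0 threshold mirrors
SA's default required score.

```python
from typing import Dict, List, Set, Tuple

# One captured message: (names of rules that hit, True if it was spam).
Sample = Tuple[Set[str], bool]

# Hypothetical outbound corpus: the header rule fires on spam and ham
# alike, so it should end up carrying less weight than the body rule.
SAMPLES: List[Sample] = [
    ({"BODY_PILLS", "HDR_LOCAL_RELAY"}, True),
    ({"HDR_LOCAL_RELAY"}, False),
    ({"BODY_PILLS"}, True),
    ({"HDR_LOCAL_RELAY", "BODY_MEETING"}, False),
]


def train_scores(samples: List[Sample], threshold: float = 5.0,
                 epochs: int = 100, step: float = 0.5) -> Dict[str, float]:
    """Fit per-rule scores so that spam totals reach the threshold."""
    scores: Dict[str, float] = {}
    for hits, _ in samples:
        for rule in hits:
            scores.setdefault(rule, 0.0)

    for _ in range(epochs):
        errors = 0
        for hits, is_spam in samples:
            flagged = sum(scores[r] for r in hits) >= threshold
            if flagged == is_spam:
                continue
            errors += 1
            # Missed spam: raise every rule that hit.
            # False positive: lower every rule that hit.
            delta = step if is_spam else -step
            for rule in hits:
                scores[rule] += delta
        if errors == 0:
            break  # every captured message is classified correctly
    return scores


def classify(scores: Dict[str, float], hits: Set[str],
             threshold: float = 5.0) -> bool:
    """Return True if the summed rule scores reach the spam threshold."""
    return sum(scores.get(r, 0.0) for r in hits) >= threshold


if __name__ == "__main__":
    for rule, score in sorted(train_scores(SAMPLES).items()):
        print(f"score {rule} {score:.1f}")
```

On this made-up corpus the body rule ends up with a higher score than
the header rule, which is the effect I'd expect on real outbound mail.
A real run would need far more messages and a proper validation set.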
-kgd