Ryan Kather wrote:

I'll answer some parts...

Ideas:
--------
Postfix- I would prefer to use SpamAssassin as a
store and forward mail filtering relay appliance.  It seems if I
place a Postfix Linux MTA in front of my existing spam solution I
could set up test groups.  100 users could be forwarded to the
SpamAssassin test box and passed internally to GroupWise.  100 users
could be forwarded to the DSPAM test box and passed internally to
GroupWise.  The rest of the users would be forwarded to the Symantec
Mail Security Gateway and passed internally to GroupWise (until such

Wouldn't it make more sense to pass the same message through each system under test?
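
If every system sees the same messages, the comparison becomes simple bookkeeping. A minimal sketch, assuming you have a hand-labelled sample and each filter's verdict per message; the message ids, the verdicts, and the helper name `error_rates` are all made up for illustration:

```python
def error_rates(verdicts, truth):
    """Return (false_positive_rate, false_negative_rate) for one filter.

    verdicts and truth map a message id to True (spam) or False (ham).
    """
    ham = [m for m in truth if not truth[m]]
    spam = [m for m in truth if truth[m]]
    false_pos = sum(1 for m in ham if verdicts[m])       # ham flagged as spam
    false_neg = sum(1 for m in spam if not verdicts[m])  # spam let through
    fp_rate = false_pos / len(ham) if ham else 0.0
    fn_rate = false_neg / len(spam) if spam else 0.0
    return fp_rate, fn_rate


# Hand-labelled sample: message id -> is it really spam?
truth = {"m1": True, "m2": True, "m3": False, "m4": False}

# Invented verdicts from each system for the same four messages.
results = {
    "spamassassin": {"m1": True, "m2": True, "m3": False, "m4": False},
    "dspam": {"m1": True, "m2": False, "m3": False, "m4": True},
}

for name, verdicts in results.items():
    fp, fn = error_rates(verdicts, truth)
    print(f"{name}: false positives {fp:.0%}, false negatives {fn:.0%}")
```

With separate user groups per filter you cannot compute this, because each filter is judged on different mail.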

I would prefer to use LDAP to validate recipients for SpamAssassin
and DSPAM which should be possible with Postfix.

Yup! LDAP forever!

I think I could accomplish this scenario with Postfix Transports,
though I may need to run multiple instances of Postfix.  Does anyone
see a flaw in this?

You should be able to look up a custom LDAP attribute that holds the next-hop hostname. It takes some LDAP work, but very basic, and you're set!
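
A rough sketch of the idea, turning per-user LDAP results into Postfix transport table entries. The attribute name `mailNextHop`, the host names, and the helper `transport_lines` are hypothetical, and the LDAP query itself is left out: the dictionary below stands in for its results.

```python
# Sketch only: "mailNextHop" is an invented custom attribute, not part of
# any standard LDAP schema. Host names are placeholders.
DEFAULT_NEXT_HOP = "symantec-gw.example.com"  # hypothetical fallback filter box


def transport_lines(entries, default=DEFAULT_NEXT_HOP):
    """Return Postfix transport-table lines, one per recipient address.

    entries maps a recipient address to the attributes LDAP returned for it.
    """
    lines = []
    for address in sorted(entries):
        next_hop = entries[address].get("mailNextHop", default)
        # Square brackets tell Postfix to skip the MX lookup for the next hop.
        lines.append(f"{address} smtp:[{next_hop}]")
    return lines


ldap_results = {
    "alice@example.com": {"mailNextHop": "sa-test.example.com"},
    "bob@example.com": {"mailNextHop": "dspam-test.example.com"},
    "carol@example.com": {},  # no attribute set: falls back to the default
}

for line in transport_lines(ldap_results):
    print(line)
```

Postfix can also query LDAP directly through an ldap: lookup table, in which case no generated file is needed; the sketch only shows what the resulting mapping looks like.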

possible to provide a fair performance picture versus SpamAssassin

Performance... are you hunting for speed or accuracy?
(perhaps you wrote it before and I missed it)

It appears many seem to be using the Amavisd-new + Postfix +
SpamAssassin configuration.  Is there a reason not to use this
design?  I have had good luck with this in the past.

This is a very good combination. Amavisd-new allows per-user (!) LDAP profiling and SQL quarantine management. I'm running both Postfix+SA and Postfix+amavis+SA+ClamAV+MailZu+LDAP on two different MXes for different domains. The latter setup requires more powerful hardware (though not necessarily, if your 4000 users generate steady traffic and their number won't grow), but it is much more manageable.

Your review should also take these frills into account!

I also have read a lot where people are improving accuracy by
increasing the scoring of the Bayesian database (which needs
[...]
can I ensure user false positives are easily reportable?  What do
others do to train the Bayesian database?  Maia-Mailguard?

After the initial setup, Bayes can more or less live its own life, provided the autolearn thresholds are broad enough. We do not let users submit messages for training (80k users!); instead we submit meaningful samples ourselves occasionally.

We've also found that spammers target common addresses such as info@, software@, john@, ... which were not in use on some domains. So we turned those into spamtraps (pretty straightforward with LDAP's mailAcceptingGeneralId or mailAlternateAddress!); we review what they catch manually and feed it to an IMAP folder for autospamlearn. Ham learning is unfortunately underestimated: we do it more rarely, using our own ham messages.
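
To make "autolearn thresholds" concrete, here is a deliberately simplified sketch of the decision. The two values mirror what I recall as the SA 3.x defaults for `bayes_auto_learn_threshold_nonspam` and `bayes_auto_learn_threshold_spam`, so check your own configuration; the function name is invented, the boundary handling is my simplification, and the real plugin applies further conditions that are not modelled here.

```python
def autolearn_decision(score, ham_threshold=0.1, spam_threshold=12.0):
    """Return "ham", "spam", or None (leave Bayes untouched) for a message score.

    Simplified: only the two thresholds are considered.
    """
    if score < ham_threshold:
        return "ham"    # confident enough to learn as non-spam
    if score >= spam_threshold:
        return "spam"   # confident enough to learn as spam
    return None         # grey zone: do not train on this message


for score in (-1.0, 5.0, 15.0):
    print(score, autolearn_decision(score))
```

The wider the gap between the two thresholds, the fewer borderline messages end up in the database.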

I could pretty much trust a small subset of users to be fairly
regular in their training.  There is a somewhat larger portion of

They might tell less trustworthy users how to take part in the training process, and those users could then break your Bayes DB. Such users are better managed with amavisd-new LDAP profiling.

use some kind of common database.  In the default configuration SA
uses one Bayesian database for all users.  Is there a reason to
change this?  What is the consensus on a shared ruleset versus
individual rulesets?

If your users receive similar kinds of messages, I'd go for a common Bayes DB. We do have a common one for all our domains (actually one for old and another for new SA servers). Individual Bayes DBs get large and if they break you've got to troubleshoot each individually...

Shared rulesets, with custom rules for special cases (using SA __META rules) evaluated for each message.

It also seems that there is a falling out between pyzor, dcc, razor,
and the community.  Is it simply a licensing issue (with legal

Can't comment on this, but consider running a DNS cache if you plan to use SA's DNS tests!

What about an initial corpus to train the Bayesian database?  Will
this hurt my accuracy in the long term?  What corpuses are being
used?  Am I better off letting the Bayesian autolearn gradually
perform this function?

You don't keep your spam, do you? :-) Train the DB with your *own* (company's) spam and ham corpus. It will not hurt. Don't use public corpuses.
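
Training from your own corpus comes down to running `sa-learn --spam` and `sa-learn --ham` over it. A small sketch that only builds the command lines and prints them; the directory paths and the helper name `sa_learn_command` are placeholders, and nothing is executed:

```python
import shlex


def sa_learn_command(kind, corpus_path):
    """Build the sa-learn argument list for a spam or ham corpus."""
    if kind not in ("spam", "ham"):
        raise ValueError(f"kind must be 'spam' or 'ham', got {kind!r}")
    return ["sa-learn", f"--{kind}", corpus_path]


# Placeholder locations for your own (company's) mail samples.
corpus = {
    "spam": "/srv/corpus/spam",
    "ham": "/srv/corpus/ham",
}

for kind, path in corpus.items():
    # Printed for review only; pass the list to subprocess.run() to execute it.
    print(shlex.join(sa_learn_command(kind, path)))
```

Feed both kinds: a database trained on spam alone has nothing to compare it against.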


SpamAssassin is typically represented as a magic dance of tweaking
rules.  Are the default rule thresholds good values to start at?  How
can I adequately decide which rules to tweak and how much to tweak
them by?  In other words, how do you manage your adjustments without
users noticing wide spam classifying variations?

We do not adjust rule scores, at least not with SA 3.1. On SA 2.6 we did adjust the Bayes scores; since most of our traffic is non-English, this helped a bit.

Default values are the most suitable for each rule.

Also, in regards to rules.  What is the preferred method for update?
Official rule releases, rulesdujour, custom?  All of the above?

Test them and decide which apply to your case. Dunno how independent your current antispam solution is, but with SA you need to invest some time reviewing false negatives/positives (if any) and evaluating extra rulesets.

How have people fared with MySQL replication of the DB?  I will need
this solution to present the same data for backup MX which is not
local to the primary MX.

First of all: we dropped the secondary MX record because it received more spam than the primary. We use a load balancer for HA.

What do you want to store on MySQL? Bayes, AWL, and quarantine are your options, and they are not mutually exclusive.

Bayes and AWL can be regenerated in a matter of minutes, and you can start (I mean "power up") a backup MX without them. Replicating quarantine is like replicating your trash between two bins. If you provide delegated quarantine, how likely is it that a HW failure will destroy a false positive? You're probably better off without the hassle of MySQL master-slave replication.

AFAIK there is a MySQL master-master replication function, but its limitations make it incompatible with amavis SQL needs.


<OT MODE ON>
X-Mailer: Novell GroupWise Internet Agent 6.0.4

OMG! It formatted your message paragraphs without breaking up lines! Luckily Thunderbird has a rewrap function!

<OT MODE OFF>

Have a nice weekend!
Paolo Cravero

--
|    QRPp-I #707  + www.paolocravero.tk +  I QRP #476   |
| SpamAssassin-based email antispam/antivirus solutions |
 \    Italian/English-to/from-Croatian translations    /
  \                   Skype: pcravero                 /
