On Mon, 12 Sep 2016, thomas cameron wrote:

On 09/12/2016 01:06 PM, John Hardin wrote:
On Mon, 12 Sep 2016, thomas cameron wrote:


Make sure you have a local recursing (**NOT** forwarding) DNS server
that your MTA and SA are configured to use. Reason: if you're forwarding
your MTA DNS requests to your ISP's DNS server, the aggregated traffic
of you plus all the other ISP clients can exceed the various DNSBL and
URIBL free-usage limits, rendering those tools useless.

[root@mail-west ~]# grep recurs /etc/named.conf
        allow-recursion { 127.0.0.1; };

A clear indicator this is happening: URIBL_BLOCKED hits.

I see "URIBL_BLACK Contains an URL listed in the URIBL blacklist" in the
headers of many of the messages that got through. Is that what you mean?

No. URIBL_BLACK means your URIBL queries are succeeding; that's a hit. URIBL_BLOCKED means "request blocked", probably because you exceeded the free-usage limits.
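
If you want to check a pile of mail for this, a rough Python sketch along these lines could pull the rule names out of an X-Spam-Status header and tell the two cases apart. The function names and the sample header are made up for illustration, and it assumes the usual "tests=..." field that SA writes into that header.

```python
import re


def rules_hit(spam_status: str) -> set[str]:
    """Return the rule names listed in the tests= field of an X-Spam-Status value."""
    # Headers are often folded across lines; collapse all whitespace runs first.
    flat = " ".join(spam_status.split())
    # Capture everything after tests= up to the next "key=" field or end of string.
    match = re.search(r"tests=(.*?)(?: \w+=|$)", flat)
    if not match:
        return set()
    return {name.strip() for name in match.group(1).split(",") if name.strip()}


def uribl_state(spam_status: str) -> str:
    """Classify the header: 'blocked', 'listed', or 'no-uribl-hit'."""
    rules = rules_hit(spam_status)
    if "URIBL_BLOCKED" in rules:
        return "blocked"   # queries are being refused, fix DNS first
    if "URIBL_BLACK" in rules:
        return "listed"    # queries work, and this URL is on the list
    return "no-uribl-hit"


if __name__ == "__main__":
    sample = ("No, score=1.2 required=5.0 tests=BAYES_50,\n"
              "\tURIBL_BLOCKED autolearn=no version=3.4.1")
    print(uribl_state(sample))
```

If a meaningful share of your messages come back "blocked", sort out the DNS setup before tuning anything else.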

Train up your Bayes using hand-vetted spam *and* ham, at least 200 of
each. Using autolearn initially can be problematic, so disable that
until SA is doing a fairly good job using hand-trained Bayes. Then you
can let autolearn keep it up-to-date if you like, and continue to
capture and manually train any persistent misses or near-misses.
Generally the more you feed Bayes the better it performs, but it must be
accurately classified. If you feed garbage to Bayes, you'll get garbage
results.

Good to know, thanks. I am running sa-learn --ham --mbox $MAIL now. I've
been running sa-learn --spam against the spam messages I've moved to my
spam folder, but forgot to teach it about ham.

It's a really bad idea to train your inbox as ham. There may be stuff in there (specifically, false negatives) that you haven't seen yet or haven't removed. Keep a separate train-as-ham folder that you manually populate after actually looking at the messages, just like you're keeping a train-as-spam folder.

You might want to wipe and retrain from scratch after setting that up, especially if you're seeing low BAYES score hits on spams and FPs.
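
As a minimal sketch of that workflow, the Python below only builds the sa-learn command lines for a wipe and retrain from two hand-vetted mbox folders; it does not run them. The folder paths are hypothetical, and it assumes both folders are in mbox format. Run the resulting commands as the user whose Bayes database SA actually uses.

```python
from pathlib import Path


def retrain_commands(ham_mbox: Path, spam_mbox: Path, wipe: bool = False) -> list[list[str]]:
    """Build sa-learn argument lists for training from hand-vetted mbox folders."""
    commands: list[list[str]] = []
    if wipe:
        # --clear throws away the existing Bayes database before retraining.
        commands.append(["sa-learn", "--clear"])
    commands.append(["sa-learn", "--ham", "--mbox", str(ham_mbox)])
    commands.append(["sa-learn", "--spam", "--mbox", str(spam_mbox)])
    return commands


if __name__ == "__main__":
    # Hypothetical folder names; use wherever you keep your vetted corpora.
    for argv in retrain_commands(Path("mail/train-ham"), Path("mail/train-spam"), wipe=True):
        print(" ".join(argv))
```

Printing the commands first makes it easy to eyeball them before anything touches the database.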

Are you seeing any BAYES rule hits at all yet?

Keep hand-classified Bayes corpora around in case you ever need to wipe
and retrain from scratch.

OK.

Ensure you're training Bayes as the user that SA is running under.
Training the wrong Bayes database is a common cause of problems.

It's a small server, so I'm doing this via procmail and spamc.
Everything runs in the context of the individual users. I need to run
sa-learn --ham as each user against their inboxes, I guess. I can add
cron jobs for each user to do that.

You might also consider running a shared/global Bayes, if all your users' mail streams are fairly similar w/r/t "what is ham?" There should be instructions in the SA wiki for setting up shared/global Bayes.

Consider doing some MTA-level DNSBL checks. The Zen DNSBL is
well-regarded. If you're using Postfix then there are some emails from
Reindl Harald on this list regarding weighted DNSBL scoring that you may
find useful. You'll have to search the archives to find those.
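
For anyone unsure what a DNSBL check does on the wire, here is a small Python sketch of how the lookup name is formed: the client IP's octets are reversed and the list's zone is appended. The function name is made up for illustration and it only handles IPv4. The MTA then does an ordinary A-record lookup on that name; an answer in 127.0.0.0/8 means "listed" and NXDOMAIN means "not listed".

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Return the DNS name an MTA would look up to check an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address, used here only as an example.
    print(dnsbl_query_name("192.0.2.10"))
```

Because each check is just a DNS query, it goes through whatever resolver the MTA uses, which is why the local recursing resolver advice applies here too.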

I'm using sendmail, and I have these checks on:

FEATURE(`dnsbl',`in.dnsbl.org')dnl
FEATURE(`dnsbl',`sbl-xbl.spamhaus.org')dnl
FEATURE(`dnsbl',`cbl.abuseat.org')dnl

I will add FEATURE(`dnsbl',`zen.spamhaus.org')dnl to it.

Zen already incorporates a couple of the ones you're using (sbl-xbl, and cbl via XBL), so don't double up.

There are some other MTA-level checks you can perform, like greet pause
and HELO validation (e.g. reject if the HELO has no dots).
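
The "no dots" HELO test is simple enough to sketch in a few lines of Python. This is only an illustration of the logic, not how any particular MTA or milter implements it; the function name is made up, and treating bracketed address literals as acceptable is an assumption of this sketch.

```python
def helo_lacks_dots(helo: str) -> bool:
    """Return True if the HELO/EHLO argument looks bogus under the 'no dots' rule."""
    name = helo.strip()
    # Address literals such as [192.0.2.1] or [IPv6:2001:db8::1] are a
    # separate case; this sketch lets them through rather than judging them.
    if name.startswith("[") and name.endswith("]"):
        return False
    # A fully qualified hostname always contains at least one dot.
    return "." not in name


if __name__ == "__main__":
    for helo in ("localhost", "mail.example.com", "[IPv6:2001:db8::1]"):
        print(helo, helo_lacks_dots(helo))
```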

Like this? http://www.harker.com/sendmail/checkhelo.html

Here's greet pause:

    FEATURE(`greet_pause',3000)dnl

I use milter-regex for HELO checks; it's a lot easier than hacking sendmail.cf (pokes sigmonster). You might consider milter-regex and take a look at this:

  http://www.impsec.org/~jhardin/antispam/milter-regex.conf

There are some things in there specific to a very small install. For example, I expect all mail legitimately from my domain to be coming in from localhost, so a HELO in my domain on the real IP is always bogus. Don't just adopt that config blindly.

Consider greylisting.

I am using milter-greylist, and it is very helpful. A lot of these
messages are actually skipping greylisting, though!

Greylisting isn't a panacea. There *are* spambots that retry, and spammers who send through real MTAs. It helps reduce the cheap anklebiters, though.

X-Greylist: Sender passed SPF test, not delayed by
milter-greylist-4.5.16 (XXX [XXX.XXX.XXX.XXX]); Mon, 12 Sep 2016
18:11:18 +0000 (UTC)

You might not want to bypass greylisting based on SPF. If the sender is using a spam domain, they could easily set up "accept from 0.0.0.0/0" in that domain's SPF.
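
To make that concrete, here is a hedged Python sketch that flags an SPF record as effectively "accept from anywhere". It is not a full SPF evaluator: it looks only at the record text itself, ignores include:/redirect= and ip6: terms, and the 8-bit prefix threshold is an arbitrary assumption chosen for this example.

```python
import ipaddress


def spf_is_wide_open(record: str, min_prefix: int = 8) -> bool:
    """Return True if an SPF TXT record passes (nearly) every IPv4 sender."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return False  # not an SPF record at all
    for term in terms[1:]:
        t = term.lower()
        # A bare "all" has the default "+" qualifier, i.e. pass everything.
        if t in ("all", "+all"):
            return True
        if t.startswith("+"):
            t = t[1:]
        # Only pass-qualified ip4 terms reach here; "-ip4:" etc. are skipped.
        if t.startswith("ip4:"):
            try:
                net = ipaddress.ip_network(t[4:], strict=False)
            except ValueError:
                continue  # malformed term, ignore it in this sketch
            if net.prefixlen < min_prefix:
                return True
    return False


if __name__ == "__main__":
    print(spf_is_wide_open("v=spf1 ip4:0.0.0.0/0 -all"))
    print(spf_is_wide_open("v=spf1 ip4:192.0.2.0/24 -all"))
```

A record like the first one passes SPF for any sending host, so an SPF pass on its own says nothing about whether the sender is legitimate.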

Keep the tips coming, I appreciate learning from you!

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  "Bother," said Pooh as he struggled with /etc/sendmail.cf, "it never
  does quite what I want. I wish Christopher Robin was here."
                                           -- Peter da Silva in a.s.r
-----------------------------------------------------------------------
 5 days until the 229th anniversary of the signing of the U.S. Constitution
