Hello again,
I'm replying to my own message because I don't want to single out anyone
who has already replied. There was value in each of your responses.
This is going to be a long email, and for that you have my apologies,
but I can't think of any way to make it shorter without it sounding like
"dude, just trust me, I know what I'm doing", and I don't want to be
that guy.
Thank you for the time and thought that went into the replies. I
sincerely appreciate your concern for the effectiveness of our spam
filtering.
One thing that is clear to me from those responses is that I did a poor
job explaining the situation, and for that you also have my apologies.
This is my attempt to correct that error.
First, any configuration change I would make would be scoped to _only_
the one (1) recipient address in question - nothing will be changed
about what SA does for any other address that we handle.
Second, this recipient address is for a queue in an RTIR [1]
installation for one of our hosted customers. The purpose of this queue
is to receive reports of suspicious activity from elsewhere, and the
queue is worked by trained security professionals. This address is
already configured to receive all messages no matter what the spam score
is. It is their job to look at each message and assess whether or not it
represents an incident, and if so what response is needed.
Third, to expand on something I alluded to briefly, the emails in
question are generated by a security appliance on our customer's
network, in accordance with their security policy and posture. The
warnings we're getting when our mail server performs these DNS queries
are coming from _our_ network infrastructure, which is AWS.
As I understand things, I have several options.
Option 1) Do nothing with any configuration.
We will continue to get notifications from AWS of this suspicious
activity, often several times in one day, that we then have to go and
correlate with mail logs to confirm that the suspicious DNS queries
were, in fact, related to the spam filtering of an email, and that the
email in question was for this queue that is specifically for receiving
that sort of content.
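
To make the manual correlation step concrete, here is a minimal Python
sketch of the matching logic. Everything in it is an assumption for
illustration: the record shapes, the two-minute window, and the
placeholder queue address are hypothetical, and real AWS findings and
mail logs would first have to be parsed into this form.

```python
from datetime import datetime, timedelta

# Placeholder for the RTIR queue address (hypothetical value).
QUEUE_ADDRESS = "incident-queue@example.com"


def related(name_a, name_b):
    """True if the two names are equal or one is a subdomain of the other."""
    a = name_a.rstrip(".").lower()
    b = name_b.rstrip(".").lower()
    return a == b or a.endswith("." + b) or b.endswith("." + a)


def explain_findings(findings, deliveries, window=timedelta(minutes=2)):
    """Pair each finding with a queue delivery scanned close in time.

    findings:   list of (timestamp, queried_name)
    deliveries: list of (timestamp, recipient, names_in_message)
    Returns a list of (finding, matching_delivery_or_None).
    """
    results = []
    for f_time, f_name in findings:
        match = None
        for delivery in deliveries:
            d_time, recipient, names = delivery
            if recipient != QUEUE_ADDRESS:
                continue  # only the incident queue explains a finding
            if abs(f_time - d_time) > window:
                continue  # too far apart to be the same scan
            if any(related(f_name, n) for n in names):
                match = delivery
                break
        results.append(((f_time, f_name), match))
    return results
```

Any finding that comes back paired with None is one that the queue does
not explain, and that is the case that would still need a human to look
at it.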
The danger with this is that we will become lax in our checking of the
mail logs and that this will essentially devolve into a variant on
option 2, but with more work.
Option 2) Tell our network monitoring (AWS) to ignore these findings for
our mail servers.
This seems fairly reasonable, as we know that our mail servers will make
these queries semi-regularly as they are running the spam filtering on
the messages for this recipient.
The downside is that it will also turn off all notification to us if
similar content were to be received at another address, potentially one
that isn't handled by trained security professionals, or even if (heaven
forbid) our mail servers were to be compromised by bitcoin mining
malware. That last one shouldn't be possible due to other controls, but
there's no denying that there is some added risk in auto-ignoring these
warnings.
Option 3) Skip all spam checking for this recipient.
It is, after all, associated with an incident response queue, expected
to receive email messages with body content that contains names of, or
even links directly to, known malware domains, and is staffed by
security professionals.
And yet, this all-or-nothing approach feels like it's sacrificing some
possible good, which leads to my questions regarding a hypothetical
Option 4.
Option 4) Targeted disabling of specific checks for this _one_ recipient
that preserves as much value as possible for the remaining checks.
I think there are several variants here, and this is where I know that I
don't have the expertise necessary to make the optimal decision.
From what I can tell from the reports, the only queries that are
triggering the security alert for our mail servers are the ones made for
records in the (known malware) domain or one of its subdomains.
In the debug logs that I inspected there were three queries:
1) malwaredomain.com/A
2) subdomain.malwaredomain.com/A
3) malwaredomain.com/NS
As best as I can tell, the results of these queries were all used for
additional DNSRBL queries, but if AWS is noticing that part of the
context, they aren't letting on.
Variant A) disable all DNSRBL checks. :(
Variant B) disable only those RBLs that ask for the information that
triggers these queries. This is an improvement, but it also skips those
same checks on everything in the message headers.
Variant C) disable some/most/all checks for names found only in the
message body. This would provide full checking of all names found in
the headers and skip only that content in which we expect to find trouble.
I believe I can do variants A) and B), so worst case would be choosing
B, but I'm willing to put in some additional work to implement variant
C) if that is possible.
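
To illustrate the distinction that variant C relies on, here is a rough
Python sketch that separates names seen in the headers from names seen
only in the body. It is not how SA extracts names; the regular
expression and the function name are my own assumptions, and it only
shows which names would be exempted under that variant.

```python
import re
from email import message_from_string

# Rough hostname pattern, for illustration only.
HOST_RE = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)


def split_names(raw_message):
    """Return (header_names, body_only_names) for a raw RFC 5322 message."""
    msg = message_from_string(raw_message)

    # Names appearing in any header value.
    header_names = set()
    for _, value in msg.items():
        header_names.update(h.lower() for h in HOST_RE.findall(str(value)))

    # Names appearing in any text part of the body.
    body_names = set()
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        body_names.update(h.lower() for h in HOST_RE.findall(text))

    # Only names that never appear in a header would be exempted.
    return header_names, body_names - header_names
```

Under variant C the first set would still get every check, and only the
second set would be skipped for this one recipient.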
If you've made it this far, I congratulate you on your endurance and
thank you for your time.
Thanks,
Brian
[1] https://github.com/bestpractical/rtir#readme