Philip Prindeville wrote:
I’m getting the following Spam.

http://www.redfish-solutions.com/misc/bluechew.eml

Received: from phylobago.mysecuritycamera.org (ec2-34-210-5-63.us-west-2.compute.amazonaws.com [34.210.5.63])

I have a local rule adding a couple of points for anything coming direct-to-MX from any Amazon compute node, period.

I added this on the basis of Amazon's abuse-reporting web form insisting that activity from any given IP may be from many AWS customers over a span of a few minutes. Legitimate mail servers do not randomly change sending or receiving domains over that timespan, so Amazon compute nodes should not be sending direct-to-MX, at all, ever.

Reality has intruded and there are in fact static IP assignments in the .compute.amazonaws.com tree (as well as ISP customers of ours who have websites with webforms on AWS, which send mail to their ISP mailbox - or sometimes their domain mailbox that's hosted with us) - otherwise I'd have scored the rule a lot higher.
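
For illustration, a rule of this kind could be sketched roughly as below. This is not the actual local rule: the rule name, description, and score are made up, and it assumes the usual X-Spam-Relays-External pseudo-header layout, where the first entry is the relay that handed the message to your MX. The optional "-N" after "compute" is also an assumption, meant to cover names like compute-1.amazonaws.com.

```
# Hypothetical sketch; rule name and score are placeholders.
# Anchoring at ^ looks only at the first external relay, i.e. the host
# that connected directly to the MX.
header   LOCAL_RCVD_AWS_COMPUTE  X-Spam-Relays-External =~ /^\[ ip=\S+ rdns=\S+\.compute(?:-\d+)?\.amazonaws\.com helo=/i
describe LOCAL_RCVD_AWS_COMPUTE  Delivered direct-to-MX from an Amazon compute node
score    LOCAL_RCVD_AWS_COMPUTE  2.0
```

A low score like this leaves room for the static-IP and webform cases mentioned above.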

And this is notable for having:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"><style>

GUID1
GUID2
GUID3
GUID4
…
</style>

so it should be easy enough to detect.

I've posted about other variations on the "invalid <style> tag" subject in the past. I've settled for these rules:

full EXTRALONG_STYLE m|<style(?: type="text/css")?>[^<]{10000,30000}<\/style>|s

This is capped at a round 30K mainly because, on most Perl builds, you have to recompile Perl to use quantifiers larger than 32766 (Perl 5.30 raised that limit to 65534).

In practice I have historically seen <style> gibberish closing on 1MB, and I have also unfortunately seen nominally legitimate <style> tags pushing 50KB or so (I'm glaring at you, Outlook).

rawbody INVALID_STYLE m|(?!<style>\s+</style>)<style>[a-z0-9\s\n/'-]+</style>|i

This should sweep up a lot of garbage that isn't valid CSS. The specifics of what kind of garbage aren't really interesting over the long term. It doesn't match this particular example, but if you switch it to a full rule as well, it does. I think I was targeting (much) shorter blobs that didn't hit EXTRALONG_STYLE in using rawbody instead of full; it's been a while since I've looked closely at these or seen any FNs that seemed relevant.
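
The "switch it to a full rule" variant might look like the sketch below. The rule name, description, and score are placeholders rather than anything from my actual rule set; the pattern is unchanged from INVALID_STYLE.

```
# Hypothetical name and score; same pattern as INVALID_STYLE, but matched
# against the whole raw message instead of rawbody.
full     INVALID_STYLE_FULL  m|(?!<style>\s+</style>)<style>[a-z0-9\s\n/'-]+</style>|i
describe INVALID_STYLE_FULL  Style block with no plausible CSS in it
score    INVALID_STYLE_FULL  1.0
```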


The 2nd type of Spam I’m seeing looks like:

http://www.redfish-solutions.com/misc/received-spf.eml

which contains:

Received: from mta.amapspa.it ([127.0.0.1])
        by localhost (mta.amapspa.it [127.0.0.1]) (amavisd-new, port 10026)
        with ESMTP id U5M-E2lVwWem; Sat,  2 Nov 2019 00:19:36 +0100 (CET)
Received-SPF: none (amapspa.it: No applicable sender policy available) receiver=mta.amapspa.it; 
identity=mailfrom; envelope-from="dario.scarpu...@amapspa.it"; 
helo="[91.134.159.128]"; client-ip=91.134.159.128
Received-SPF: none (amapspa.it: No applicable sender policy available) receiver=mta.amapspa.it; 
identity=mailfrom; envelope-from="dario.scarpu...@amapspa.it"; 
helo="[91.134.159.128]"; client-ip=91.134.159.128
Received-SPF: none (amapspa.it: No applicable sender policy available) receiver=mta.amapspa.it; 
identity=mailfrom; envelope-from="dario.scarpu...@amapspa.it"; 
helo="[91.134.159.128]"; client-ip=91.134.159.128
…

with that line being repeated some 40 times, each line being identical.

I tried a rule like:

header __L_RECEIVED_SPF         exists:Received-SPF
tflags __L_RECEIVED_SPF         multiple maxhits=20

meta L_RECEIVED_SPF             (__L_RECEIVED_SPF >= 10)
describe L_RECEIVED_SPF         Crazy numbers of Received-SPF headers
score L_RECEIVED_SPF            20.0


but it never seems to match. I’ve not tried to debug this, but it seems that duplicated headers might not be saved as a list into the headers? (Is there an easy way to see what exists:Received-SPF is evaluating as?)

IIRC if you want to catch something like this you have to use the ALL metaheader and chew through the entire header section as a block. Something like:

header __MANY_RCVD_SPF  ALL:raw =~ /^Received-SPF:/m
tflags __MANY_RCVD_SPF  multiple maxhits=20
meta MANY_RCVD_SPF      __MANY_RCVD_SPF >= 10

exists: in particular is, I think, effectively boolean; by definition it either hits once or it doesn't hit at all.

*prods rules*

This also works:

header __MANY_RCVD_SPF  Received-SPF =~ /.+/
tflags __MANY_RCVD_SPF  multiple maxhits=20

I'm not sure which is more efficient.

-kgd
