On 15/06/2016 22:42, Dianne Skoll wrote:
On Wed, 15 Jun 2016 13:40:25 -0700 (PDT)
John Hardin <jhar...@impsec.org> wrote:
That's (more or less) "Quoted Printable" encoding.
AFAIK, SpamAssassin "body" rules are applied after the
Content-Transfer-Encoding: has been decoded. So the QP equal signs
are a red herring.
Regards,
Dianne.
Yes, I thought that too.
I have written my own rules occasionally and, being a total novice, I just
set about it by trial and error without understanding all this
encoding stuff. In doing so I found that 'line-wrapped' words
(split with the equals sign) are decoded and rejoined before the rule is
applied to them.
Here is a real example:
body __MY_PHISH_CIRCUMVENT_ATTEMPT3 /((?!account)(\xD0\xB0|a)(\xD1\x81|c){2}(\xD0\xBE|o)u(\xD5\xB8|n)t|(?!customer)(\xE1\xB4\x84|c)u(\xD1\x95|S)t(\xD0\xBE|o)mer|(?!verif(y|i))ver(\xD1\x96|i)f((\xD1\x83|y)|(\xD1\x96|i)))/i
(effectively looking for sneaky look-alike Unicode characters, mostly
Cyrillic, standing in for real letters in words such as "account",
"customer" and "verify"/"verifi"; definitely phishing and dodgy if this
exists).
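For anyone curious which characters those byte sequences actually are, here is a small Python sketch (nothing to do with SpamAssassin itself; the list is just the escapes copied out of the rule above, and the name `LOOKALIKES` is my own) that decodes each one as UTF-8 and prints its code point and Unicode name:

```python
import unicodedata

# (ASCII letter being imitated, UTF-8 bytes copied from the rule above)
LOOKALIKES = [
    ("a", b"\xD0\xB0"),
    ("c", b"\xD1\x81"),
    ("o", b"\xD0\xBE"),
    ("n", b"\xD5\xB8"),
    ("c", b"\xE1\xB4\x84"),
    ("s", b"\xD1\x95"),
    ("i", b"\xD1\x96"),
    ("y", b"\xD1\x83"),
]

def describe(raw: bytes) -> tuple[int, str]:
    """Decode one UTF-8 sequence and return (code point, Unicode name)."""
    ch = raw.decode("utf-8")
    return ord(ch), unicodedata.name(ch)

for ascii_letter, raw in LOOKALIKES:
    code_point, name = describe(raw)
    print(f"{ascii_letter} -> U+{code_point:04X} {name}")
```

Most of them land in the Cyrillic block; the "n" stand-in is an Armenian letter and the second "c" is a Latin small capital.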
And this is REAL body text from an email:
-- SNIP ---------------------------------------
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
.
.
.
<TD style=3D"FONT-FAMILY: Helvetica, Arial, sans-serif; COLOR: rgb(102,102,1=
02); PADDING-BOTTOM: 15px; PADDING-TOP: 15px; PADDING-LEFT: 0px; PADDING-RIG=
HT: 0px" width=3D471 align=3Dleft><FONT size=3D2 face=3D"Arial,elv Hetica, s=
ans-serif"><STRONG>da...@decrofloor.co.uk</STRONG> - =D0=85=D0=B5=D1=81=
ur=D1=96t=D1=83 m=D0=B5=D0=B0=D1=95ur=D0=B5=D1=95 h=D0=B0=D1=95 b=D0=B5=D0=
=B5n =D0=B0=D1=80=D1=80=D3=8F=D1=96=D0=B5d t=D0=BE =D1=83=D0=BEur =D0=B0=D1=
=81=D1=81=D0=BEu=D5=B8t.</FONT></TD>
-----------------------------------------
I can tell you that the very last word/sequence of characters:
=D0=B0=D1=
=81=D1=81=D0=BEu=D5=B8t
gets caught despite being split and line-wrapped with an equals sign
(FYI it looks like "ассоuոt.", i.e. "account").
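
To illustrate why the trailing equals sign makes no difference, here is a minimal Python sketch. It is only a stand-in for what SpamAssassin does internally: I am assuming the standard `quopri` module behaves like the Content-Transfer-Encoding decoding step, and the pattern is just the "account" alternative from my rule rewritten as a Python bytes regex.

```python
import quopri
import re

# The last word of the sample, as it appears on the wire. The "=" at the
# end of the first line is a quoted-printable soft line break.
raw = b"=D0=B0=D1=\n=81=D1=81=D0=BEu=D5=B8t."

# Decoding removes the soft line break and rejoins the split UTF-8 bytes.
decoded = quopri.decodestring(raw)

# Bytes-regex version of the "account" alternative from the rule.
# The negative lookahead skips the genuine all-ASCII word.
pattern = re.compile(
    rb"(?!account)(\xD0\xB0|a)(\xD1\x81|c){2}(\xD0\xBE|o)u(\xD5\xB8|n)t",
    re.IGNORECASE,
)

hit_on_raw = pattern.search(raw) is not None          # still QP-encoded
hit_on_decoded = pattern.search(decoded) is not None  # after decoding

print("raw matches:    ", hit_on_raw)
print("decoded matches:", hit_on_decoded)
print("decoded text:   ", decoded.decode("utf-8"))
```

The pattern only hits on the decoded bytes, which fits with body rules being run after the transfer encoding has been undone.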