Re: how to write body rules to match 'tortured html' variations of text phrases?

Groach Wed, 15 Jun 2016 13:58:16 -0700


On 15/06/2016 22:42, Dianne Skoll wrote:

On Wed, 15 Jun 2016 13:40:25 -0700 (PDT)
John Hardin <jhar...@impsec.org> wrote:

That's (more or less) "Quoted Printable" encoding.

AFAIK, SpamAssassin "body" rules are applied after the
Content-Transfer-Encoding: has been decoded.  So the QP equal signs
are a red herring.

Regards,

Dianne.

Yes, I thought that too.

I have written my own rules occasionally, and being a total novice I just set about it using trial and error without understanding all this encoding stuff. In so doing I found that 'line-wrapped' words (split with the equals sign) are decoded first, and the rule is then applied to the decoded text.


Here is a real example:

body __MY_PHISH_CIRCUMVENT_ATTEMPT3 /((?!account)(\xD0\xB0|a)(\xD1\x81|c){2}(\xD0\xBE|o)u(\xD5\xB8|n)t|(?!customer)(\xE1\xB4\x84|c)u(\xD1\x95|S)t(\xD0\xBE|o)mer|(?!verif(y|i))ver(\xD1\x96|i)f((\xD1\x83|y)|(\xD1\x96|i)))/i

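(Effectively looking for sneaky non-Latin characters used as look-alikes for real letters, to make words such as "account", "customer" and "verify"/"verifi". The negative lookaheads skip the plain ASCII spellings, so only the disguised versions hit. Definitely phishing and dodgy if this exists.)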
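For anyone wondering what those \x escapes actually are, here is a small Python sketch (not part of SpamAssassin, just an illustration) that decodes each UTF-8 byte sequence used in the rule and prints its code point and Unicode name. The dictionary name `homoglyphs` and the helper `describe` are made up for this example; the ASCII letter next to each sequence is simply the alternative it is paired with in the rule.

```python
import unicodedata

# UTF-8 byte sequences taken from the rule, each paired with the
# ASCII letter it sits next to in the rule's alternations.
homoglyphs = {
    b"\xD0\xB0": "a",
    b"\xD1\x81": "c",
    b"\xD0\xBE": "o",
    b"\xD5\xB8": "n",
    b"\xE1\xB4\x84": "c",
    b"\xD1\x95": "s",
    b"\xD1\x96": "i",
    b"\xD1\x83": "y",
}


def describe(seq: bytes) -> str:
    """Decode one UTF-8 sequence and return its code point and Unicode name."""
    ch = seq.decode("utf-8")
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for seq, ascii_letter in homoglyphs.items():
    print(f"{seq!r} imitates {ascii_letter!r}: {describe(seq)}")
```

Each sequence decodes to exactly one character, mostly Cyrillic plus one Armenian and one Latin small-capital letter.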


And this is REAL body text from an email:

-- SNIP ---------------------------------------
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

.
.
.
<TD style=3D"FONT-FAMILY: Helvetica, Arial, sans-serif; COLOR: rgb(102,102,1=
02); PADDING-BOTTOM: 15px; PADDING-TOP: 15px; PADDING-LEFT: 0px; PADDING-RIG=
HT: 0px" width=3D471 align=3Dleft><FONT size=3D2 face=3D"Arial,elv Hetica, s=
ans-serif"><STRONG>da...@decrofloor.co.uk</STRONG>&nbsp;- =D0=85=D0=B5=D1=81=
ur=D1=96t=D1=83 m=D0=B5=D0=B0=D1=95ur=D0=B5=D1=95 h=D0=B0=D1=95 b=D0=B5=D0=
=B5n =D0=B0=D1=80=D1=80=D3=8F=D1=96=D0=B5d t=D0=BE =D1=83=D0=BEur =D0=B0=D1=
=81=D1=81=D0=BEu=D5=B8t.</FONT></TD>
-----------------------------------------


I can tell you that the very last word/sequence of characters:

=D0=B0=D1=
=81=D1=81=D0=BEu=D5=B8t

gets caught despite being split and line-wrapped with an equals sign (FYI it looks like "ассоuոt." - account).
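To show why the wrap does not matter, here is a rough Python sketch of "decode first, then match". It is only an analogy: Python's `quopri` and `re` modules stand in for whatever SpamAssassin does internally, and only the "account" branch of the rule is reproduced. The variable names are my own.

```python
import quopri
import re

# The last word of the sample exactly as it appears on the wire:
# the "=" at the end of the first line is a quoted-printable soft line break.
raw = b"=D0=B0=D1=\n=81=D1=81=D0=BEu=D5=B8t"

# Decoding removes the soft line break and turns each =XX into one byte.
decoded = quopri.decodestring(raw)

# The "account" branch of the rule, written as a Python bytes regex.
account_lookalike = re.compile(
    rb"(?!account)(\xD0\xB0|a)(\xD1\x81|c){2}(\xD0\xBE|o)u(\xD5\xB8|n)t",
    re.IGNORECASE,
)

print(decoded.decode("utf-8"))
print("decoded text matches:", bool(account_lookalike.search(decoded)))
print("raw QP text matches:", bool(account_lookalike.search(raw)))
print("plain 'account' matches:", bool(account_lookalike.search(b"your account")))
```

The pattern only hits the decoded bytes. The still-encoded text does not match, and neither does the honest ASCII spelling because of the negative lookahead.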
