Re: New type of obfuscation?

John Hardin Wed, 31 Dec 2014 19:02:07 -0800

On Wed, 31 Dec 2014, Martin Gregorie wrote:

> During last night I received a phishing message with a new (to me
> anyway) form of obfuscation which can only be used inside HTML body text
> using us-ascii encoding. The obfuscation was apparently aimed at SA and
> similar scanners because it's not obvious to anybody reading the message:
> every 'o' (0x6f) in the text is replaced by &#959;
>
> My Perl-fu isn't good enough to encode this in a regex - can anybody
> help?
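
To make the substitution in the quoted message concrete: &#959; is the decimal HTML character reference for U+03BF (GREEK SMALL LETTER OMICRON), which renders almost identically to a Latin 'o'. The following is only a minimal Python sketch using the standard library; the sample word and the helper name `describe` are made up for illustration. It shows why a plain match on the raw HTML misses the word, and why decoding the entity alone is not enough either.

```python
import html
import unicodedata

def describe(raw):
    """Decode character references and report what an 'o' lookalike really is."""
    rendered = html.unescape(raw)  # "&#959;" becomes the single character U+03BF
    names = [unicodedata.name(ch) for ch in rendered if ord(ch) > 127]
    return rendered, names

# Hypothetical sample word with its 'o' replaced as described above.
raw = "Acc&#959;unt"
rendered, names = describe(raw)

print(names)                  # Unicode names of the non-ASCII characters found
print("Account" in raw)       # the raw HTML does not contain the plain word
print(rendered == "Account")  # nor does the decoded text: omicron is not 'o'
```

So a rule has to match either the entity text itself or the decoded lookalike character, which is what the tag approach in the reply addresses.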

Take a look at 25_replace.cf (esp. tags C and E), and the various FUZZY_*
rules. It's not feasible to do broadly, but specific commonly-obfuscated
words and short phrases can be focused on and that potentially would help
Bayes recognize such as spammy more quickly.

I've been extending 25_replace.cf as I see more different types of
obfuscation like this, but it's a bit hard to keep up. Given a list of
Unicode code points that look like specific Latin letters, it should not
be hard to automatically generate the tag subrules for obfuscation for
all the encodings.
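
The generation step described above might look roughly like the sketch below. It is written in Python rather than Perl purely for illustration, and everything in it is an assumption: the tiny `LOOKALIKES` table is hand-picked, the function names are invented, and only three encodings are covered (raw UTF-8 bytes, decimal character references, hex character references). It emits one regex alternation per Latin letter; how that gets wrapped into a tag definition should follow whatever 25_replace.cf already does.

```python
import re

# Hypothetical, hand-picked lookalike table; a real list would be far larger.
LOOKALIKES = {
    "a": [0x0430],          # CYRILLIC SMALL LETTER A
    "e": [0x0435],          # CYRILLIC SMALL LETTER IE
    "o": [0x03BF, 0x043E],  # GREEK SMALL LETTER OMICRON, CYRILLIC SMALL LETTER O
}

def encodings_for(cp):
    """Return regex fragments matching one code point in several encodings."""
    # Raw UTF-8 bytes, written as \xNN escapes.
    utf8 = "".join("\\x%02x" % b for b in chr(cp).encode("utf-8"))
    # Decimal reference, allowing leading zeros; '#' is escaped defensively.
    decimal = r"&\#0*%d;" % cp
    # Hex reference, allowing leading zeros and either letter case.
    hexref = r"&\#[xX]0*(?i:%x);" % cp
    return [utf8, decimal, hexref]

def alternation(letter):
    """Build a non-capturing alternation for a letter and its lookalikes."""
    parts = [re.escape(letter)]
    for cp in LOOKALIKES[letter]:
        parts.extend(encodings_for(cp))
    return "(?:" + "|".join(parts) + ")"

if __name__ == "__main__":
    for letter in sorted(LOOKALIKES):
        print(letter, alternation(letter))
```

The fragments use only constructs that Perl regexes also accept, but they have not been tested inside SpamAssassin itself.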

Is there such a list anywhere already that could be leveraged? I know we
were discussing unicode normalization of body text at one point, is there
anything there we could use?
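
On the normalization question, one thing that is easy to check is what the standard Unicode normalization forms do with lookalikes. The sketch below uses Python's `unicodedata` module (the `nfkc` helper name is just for illustration): NFKC folds compatibility variants such as fullwidth letters to plain Latin, but it leaves cross-script lookalikes such as Greek omicron and Cyrillic o untouched. Normalization alone would therefore probably not cover this case; the "confusables" data published alongside the Unicode security mechanisms may be closer to the kind of list being asked about, though whether it fits here is untested.

```python
import unicodedata

def nfkc(text):
    """Apply Unicode NFKC (compatibility) normalization."""
    return unicodedata.normalize("NFKC", text)

samples = {
    "fullwidth o (U+FF4F)": "\uff4f",
    "Greek omicron (U+03BF)": "\u03bf",
    "Cyrillic o (U+043E)": "\u043e",
}

# Report which lookalikes normalization maps onto a plain Latin 'o'.
for label, ch in samples.items():
    print(label, "->", "folds to 'o'" if nfkc(ch) == "o" else "unchanged")
```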


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  It is not the business of government to make men virtuous or
  religious, or to preserve the fool from the consequences of his own
  folly.                                              -- Henry George
-----------------------------------------------------------------------
 944 days since the first successful private support mission to ISS (SpaceX)
