On Wed, 26 Jun 2019, Amir Caspi wrote:

> John et al,
>
> I recall from a thread last year that there were supposed to be some
> rules to check for zero-width joiner characters, but I've recently been
> seeing spam that contains them without hitting any such rules.
>
> Here's one spample, where the ZWJ entity #x200B is being used to try to
> sidestep Bayes detection of highly spammy words:
> https://pastebin.com/kx0jVBtZ

I'll take a look. It's possible that there are some zero-width characters
the RE isn't looking for.
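
For illustration only, here is a minimal Python sketch of the kind of check being discussed. It is not the actual SpamAssassin rule or its RE. The set of code points in ZERO_WIDTH is an assumption, and the names count_split_words and strip_zero_width are hypothetical. Note that U+200B is strictly ZERO WIDTH SPACE, while the joiner proper is U+200D, so a check that only looks for the joiner would miss the character in this sample.

```python
import re

# Assumed set of invisible code points (not taken from any real rule):
# ZERO WIDTH SPACE, ZERO WIDTH NON-JOINER, ZERO WIDTH JOINER,
# WORD JOINER, ZERO WIDTH NO-BREAK SPACE (BOM).
ZERO_WIDTH = "\u200b\u200c\u200d\u2060\ufeff"

# A run of zero-width characters with a word character on both sides,
# i.e. one that splits a word without changing how it renders.
SPLIT_RE = re.compile(rf"(?<=\w)[{ZERO_WIDTH}]+(?=\w)")


def count_split_words(text: str) -> int:
    """Count the places where zero-width characters sit inside a word."""
    return len(SPLIT_RE.findall(text))


def strip_zero_width(text: str) -> str:
    """Remove the assumed zero-width characters so tokens can be compared."""
    return text.translate({ord(ch): None for ch in ZERO_WIDTH})
```

A scanner along these lines could either flag a message when count_split_words is nonzero or normalize the body with strip_zero_width before tokenizing, so that the split words look the same as their unobfuscated forms.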


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
 Warning Labels we'd like to see #1: "If you are a stupid idiot while
 using this product you may hurt yourself. And it won't be our fault."
-----------------------------------------------------------------------
 7 days until the 243rd anniversary of the Declaration of Independence
