On Thu, 21 Mar 2019, Martin Gregorie wrote:
On Thu, 2019-03-21 at 09:23 -0700, John Hardin wrote:
On Thu, 21 Mar 2019, Savvas Karagiannidis wrote:
What should be considered is the message's language. All messages that
were false positives had the following MIME encoding (the messages were
actually in Greek):
Content-Type: text/[plain|html]; charset="windows-1253" or
Content-Type: text/[plain|html]; charset="iso-8859-7"
while all messages that were actual spam and were properly detected had:
Content-Type: text/[plain|html]; charset="utf-8"
It should be fairly easy to add an exclusion based on that information.
However, that information may well be leveraged by spammers who are
using that obfuscation...
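
For illustration only, here is a minimal sketch of the kind of charset check being discussed. It is written in Python with the standard email module, not in SpamAssassin rule syntax, and it is not how any existing rule is implemented. The names GREEK_CHARSETS, declared_charsets and is_legacy_greek are hypothetical, and the charset list is taken only from the two values reported above.

```python
from email import message_from_string

# Assumption: only the two legacy Greek charsets reported in this thread.
GREEK_CHARSETS = {"windows-1253", "iso-8859-7"}


def declared_charsets(raw_message):
    """Return the set of charsets declared on the text/* parts of a message."""
    msg = message_from_string(raw_message)
    charsets = set()
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            # get_content_charset() returns the charset lowercased, or None.
            charset = part.get_content_charset()
            if charset:
                charsets.add(charset)
    return charsets


def is_legacy_greek(raw_message):
    """True if every declared text charset is a legacy Greek one."""
    found = declared_charsets(raw_message)
    return bool(found) and found <= GREEK_CHARSETS


sample = 'Content-Type: text/plain; charset="windows-1253"\n\nbody\n'
print(is_legacy_greek(sample))
```

The check relies only on what the sender declares, so the caveat above still applies: a spammer can put either of these charsets in the header to slip past an exclusion built this way.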
FWIW roughly 10% of my spam corpus uses <font> tags to set white text.
...wrong thread? :)
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
...the Constitution and the Bill of Rights exists to protect
the individual, not the mob. -- Matt Pickering
-----------------------------------------------------------------------
721 days since the first commercial re-flight of an orbital booster (SpaceX)