On Thu, 15 Sep 2016, John Hardin wrote:
> On Wed, 15 Sep 2016, Chip M. wrote:
>> Sadly, I have more FP data for you. :(
>> Here's one specific example (just a single very long line from
>> one corpse):
>> background-image: url("data:image/svg+xml;charset=utf8,%3Csvg
>> width='104px' height='82px' viewBox='0 0 104 82' version='1.1'
>> xmlns='http://www.w3.org/2000/svg'
>
> Ok, I excluded image data from URI_DATA. This should reduce FPs without
> hurting spam/phish detection (I hope).

...and now __URI_DATA isn't hitting *anything*.
I suspect that the only data: URLs in the masscheck corpora are for
embedded images. This makes sense if they're being used primarily for
spearphishing.
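
For anyone following along, here is a rough sketch of the kind of exclusion being discussed: flag data: URIs unless the declared media type is an image. This is Python for illustration only, not the actual __URI_DATA rule; the pattern and the function name is_non_image_data_uri are made up for this example, and the real rule may be written quite differently.

```python
import re

# Hypothetical pattern, NOT the real __URI_DATA rule: match a data: URI
# whose declared media type does not start with "image/".
NON_IMAGE_DATA_URI = re.compile(r"^data:(?!image/)", re.IGNORECASE)


def is_non_image_data_uri(uri: str) -> bool:
    """Return True for data: URIs that do not declare an image media type."""
    return NON_IMAGE_DATA_URI.match(uri.strip()) is not None


if __name__ == "__main__":
    samples = [
        "data:image/svg+xml;charset=utf8,%3Csvg width='104px'",  # image: skipped
        "data:text/html;base64,PGh0bWw+",                        # non-image: flagged
        "http://www.w3.org/2000/svg",                            # not a data: URI
    ]
    for uri in samples:
        print(is_non_image_data_uri(uri), uri)
```

With a check like this, the SVG background-image example from the FP report is skipped, which would also explain zero hits if every data: URI in the corpora is an embedded image.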
Chip, could you send me some spamples of non-image data: messages
offlist? The only ones I have anywhere are images.
Thanks!
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79