On Mon, 12 Oct 2009, McDonald, Dan wrote:

We are getting a number of word docs with scams in them.

Likewise, as well as PDFs.

The word doc has a pretty standard 419 body in it. I recall some mutterings on this list about using wvHtml to regularize word docs.

There were mutterings about a generic plugin that would take an attachment, process it somehow (e.g. wvHtml, antiword, ps2ascii, or whatever was appropriate), and insert the results into the body text to be scanned by the regular rules. I don't think anything has come of that yet.
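
To make the idea concrete, here is a rough sketch of the flow in Python (a real SpamAssassin plugin would be Perl, so this is only an illustration). The names external_converter, CONVERTERS and text_for_scanning are made up for this example, and it assumes antiword prints the extracted text to stdout when given a file path.

```python
import subprocess
import tempfile
from email import message_from_bytes


def external_converter(argv):
    """Return a converter that runs `argv + [path]` on the attachment bytes."""
    def convert(payload: bytes) -> str:
        with tempfile.NamedTemporaryFile(suffix=".bin") as tmp:
            tmp.write(payload)
            tmp.flush()
            # The tool is assumed to write plain text to stdout.
            result = subprocess.run(argv + [tmp.name], capture_output=True,
                                    timeout=30, check=False)
        return result.stdout.decode("utf-8", errors="replace")
    return convert


# Hypothetical dispatch table: MIME type -> converter callable.
CONVERTERS = {
    "application/msword": external_converter(["antiword"]),
}


def decode_text(part) -> str:
    """Decode a text part, falling back to UTF-8 on an unknown charset."""
    payload = part.get_payload(decode=True) or b""
    try:
        return payload.decode(part.get_content_charset() or "utf-8",
                              errors="replace")
    except LookupError:
        return payload.decode("utf-8", errors="replace")


def text_for_scanning(raw: bytes, converters=CONVERTERS) -> str:
    """Concatenate the plain-text body with text extracted from attachments."""
    chunks = []
    for part in message_from_bytes(raw).walk():
        if part.is_multipart():
            continue
        ctype = part.get_content_type()
        if ctype == "text/plain":
            chunks.append(decode_text(part))
        elif ctype in converters:
            chunks.append(converters[ctype](part.get_payload(decode=True) or b""))
    return "\n".join(chunks)
```

The converter table is passed in as an argument so that other tools (or a stub, for testing) can be swapped in without touching the MIME walk.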

I'm also thinking of looking for a word attachment in messages with
FREEMAIL_REPLYTO, and would appreciate thoughts on that.

Something like:
mimeheader __ANY_WORD_ATTACH Content-Type =~ /application\/msword/i

There are already some attachment type rules in my sandbox that you might want to leverage instead of starting from scratch.

http://ruleqa.spamassassin.org/?daterev=20091011-r824040-n&rule=%2F__(%3F%3ADOC|PDF)_ATTACH&srcpath=jhardin

meta    AE_FREE_WORD    (__ANY_WORD_ATTACH && FREEMAIL_REPLYTO)
score   AE_FREE_WORD    1.5

I _almost_ have that in my sandbox already. I'll add it, but I'm not sure there is much of that style of spam in the masscheck corpora yet (see the above URI).
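
In the meantime, a rough way to see how often the combination turns up in a local corpus is to approximate the two conditions outside SpamAssassin. This is only a sketch: FREEMAIL_DOMAINS is a tiny made-up stand-in for the FreeMail plugin's list, the function names are hypothetical, and the Reply-To check is cruder than what FREEMAIL_REPLYTO actually tests (it only looks at whether the Reply-To domain is on the list).

```python
from email import message_from_bytes
from email.utils import getaddresses

# Made-up stand-in; the real plugin ships its own, much longer list.
FREEMAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com"}


def has_word_attachment(msg) -> bool:
    """True if any MIME part is declared as application/msword."""
    return any(part.get_content_type() == "application/msword"
               for part in msg.walk())


def has_freemail_reply_to(msg, domains=FREEMAIL_DOMAINS) -> bool:
    """True if any Reply-To address is on one of the listed domains."""
    for _name, addr in getaddresses(msg.get_all("Reply-To", [])):
        if addr.rpartition("@")[2].lower() in domains:
            return True
    return False


def matches_free_word(raw: bytes) -> bool:
    """Approximate the meta rule: Word attachment AND freemail Reply-To."""
    msg = message_from_bytes(raw)
    return has_word_attachment(msg) and has_freemail_reply_to(msg)
```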

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
 Warning Labels we'd like to see #1: "If you are a stupid idiot while
 using this product you may hurt yourself. And it won't be our fault."
-----------------------------------------------------------------------
 11 days since a sunspot last seen - EPA blames CO2 emissions
