On Mon, 12 Oct 2009, McDonald, Dan wrote:
We are getting a number of word docs with scams in them.
Likewise, as well as PDFs.
The Word doc has a pretty standard 419 body in it. I recall some mutterings on this list about using wvHtml to regularize Word docs.
There were mutterings about a generic plugin that would take an attachment, process it somehow (e.g. wvHtml, antiword, ps2ascii, or whatever was appropriate), and insert the results into the body text to be scanned by the regular rules. I don't think anything has come of that yet.
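To make the idea concrete, here is a rough sketch of that flow in Python rather than as a real SpamAssassin plugin (an actual plugin would be written in Perl against the plugin API, which this does not attempt to show). The names extract_scannable_text, fake_word_converter and build_demo_message are hypothetical, and the converter is a stand-in: a real one would presumably hand the bytes to wvHtml, antiword, ps2ascii or similar.

```python
from email.message import Message
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from typing import Callable, Dict

# Hypothetical converter table type: MIME type -> function turning raw bytes into text.
Converter = Callable[[bytes], str]


def extract_scannable_text(msg: Message, converters: Dict[str, Converter]) -> str:
    """Return the plain-text body plus text rendered from known attachment types."""
    chunks = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        ctype = part.get_content_type()
        payload = part.get_payload(decode=True) or b""
        if ctype == "text/plain":
            charset = part.get_content_charset() or "utf-8"
            chunks.append(payload.decode(charset, errors="replace"))
        elif ctype in converters:
            # Attachment type we know how to render: append its text to the body.
            chunks.append(converters[ctype](payload))
    return "\n".join(chunks)


def fake_word_converter(data: bytes) -> str:
    # Assumption: stand-in for an external tool; it only decodes the bytes.
    return data.decode("latin-1")


def build_demo_message() -> Message:
    # Tiny multipart message with a text body and a pretend Word attachment.
    msg = MIMEMultipart()
    msg["Subject"] = "demo"
    msg.attach(MIMEText("Please see the attached document.", "plain"))
    msg.attach(MIMEApplication(b"DEAR FRIEND, I AM A BANK MANAGER", "msword"))
    return msg


if __name__ == "__main__":
    table = {"application/msword": fake_word_converter}
    print(extract_scannable_text(build_demo_message(), table))
```

The converter table is keyed on MIME type so that new attachment types can be added without touching the walker; the combined text is what the regular body rules would then see.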
I'm also thinking of looking for a Word attachment in messages with FREEMAIL_REPLYTO, and would appreciate thoughts on that. Something like:
mimeheader __ANY_WORD_ATTACH Content-Type =~ /application\/msword/i
There are already some attachment-type rules in my sandbox that you might want to leverage instead of starting from scratch.
http://ruleqa.spamassassin.org/?daterev=20091011-r824040-n&rule=%2F__(%3F%3ADOC|PDF)_ATTACH&srcpath=jhardin
meta AE_FREE_WORD (__ANY_WORD_ATTACH && FREEMAIL_REPLYTO)
score AE_FREE_WORD 1.5
I _almost_ have that in my sandbox already. I'll put that into the sandbox, but I'm not sure there is much of that style spam in the masscheck corpora yet (see the above URI).
-- 
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Warning Labels we'd like to see #1: "If you are a stupid idiot while
  using this product you may hurt yourself. And it won't be our fault."
-----------------------------------------------------------------------
 11 days since a sunspot last seen - EPA blames CO2 emissions