Re: FuzzyOcr 2.3b released, fixes bugs and improves stability

Justin Mason Sun, 27 Aug 2006 03:24:19 -0700

"John D. Hardin" writes:
>On Sat, 26 Aug 2006, Loren Wilton wrote:
>> > That's what I was thinking, and would allow leverage by a lot of
>> > plugins (e.g. the Word plugin I am prepping to start)...
>> >
>> > Create some PerMsgStatus string variable or some such that the body
>> > rules would be run over...
>> 
>> Actually the easy way would probably be to create a new X-Spam
>> header item that rules could run on.
>
>...an X-Spam-mumble header containing the text extracted from an
>attached Word document? That somehow strikes me as a bad idea...


Actually, I think it's quite a good one ;)  Headers give plugins a
good way to offer name=value metadata pairs for rules to match on.

The idea of sticking text from OCR'd images into the body is interesting
-- however, I'm not sure it'd be useful in this case. What makes the
rules accurate is not that the text appears *anywhere* in the mail, but
that it appears in an OCR'd image.
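
To illustrate that distinction, here is a rough Python sketch (not
SpamAssassin code; the header name X-Spam-OCR-Text and all helper names
are made up for this example). The plugin's output is kept as a metadata
header, so a rule scoped to that header fires only when the phrase came
from the OCR'd image, whereas a body rule fires on the phrase wherever
it appears in the text.

```python
import re
from email.message import EmailMessage

# Hypothetical header name, chosen only for this illustration.
OCR_HEADER = "X-Spam-OCR-Text"


def attach_ocr_metadata(msg, ocr_text):
    """Store OCR output as a metadata header rather than in the body."""
    # Collapse whitespace so the value fits on a single header line.
    msg[OCR_HEADER] = " ".join(ocr_text.split())
    return msg


def header_rule(msg, header, pattern):
    """Match a pattern against one named header only."""
    return re.search(pattern, msg.get(header, ""), re.IGNORECASE) is not None


def body_rule(msg, pattern):
    """Match a pattern against the plain-text body only."""
    return re.search(pattern, msg.get_content(), re.IGNORECASE) is not None


if __name__ == "__main__":
    spam = EmailMessage()
    spam.set_content("See attached.")
    attach_ocr_metadata(spam, "Buy  cheap\nstock now")

    ham = EmailMessage()
    ham.set_content("Is cheap stock ever a good idea?")

    for name, msg in (("spam", spam), ("ham", ham)):
        print(name,
              "header rule:", header_rule(msg, OCR_HEADER, r"cheap stock"),
              "body rule:", body_rule(msg, r"cheap stock"))
```

If the OCR text were merged into the body instead, the two messages
above would be indistinguishable to a body rule.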

>> I think it would be easy enough for the plugin to stick text into
>> the body array if it wanted to, and if it ran early enough that it
>> would be useful.  Whether or not the ocr text would be useful for
>> body rules is an entirely different question.
>
>The text within an attached image or document will have verbiage
>similar to the text within a classical spam - the goal, after all, is
>to sell something to the victim.
>
>I can see it now: spammers reduced to sending obfuscated text rendered
>as an animated GIF embedded in a Word document in a Zip file attached
>to an email whose subject is "Invoice #437892" with no body text... :)

and people would still read it ;)

--j.
