On Sun, 27 Aug 2006, Justin Mason wrote:

> "John D. Hardin" writes:
> >On Sat, 26 Aug 2006, Loren Wilton wrote:
> >> > That's what I was thinking, and would allow leverage by a lot of
> >> > plugins (e.g. the Word plugin I am prepping to start)...
> >> >
> >> > Create some PerMsgStatus string variable or some such that the body
> >> > rules would be run over...
> >>
> >> Actually the easy way would probably be to create a new X-Spam
> >> header item that rules could run on.
> >
> >...an X-Spam-mumble header containing the text extracted from an
> >attached Word document? That somehow strikes me as a bad idea...
>
> Actually, I think it's quite a good one ;) headers provide a
> good way for plugins to offer name=value metadata pairs for rules
> to match on.

Well, yes, so long as the header does not get inserted into the
rewritten message.

However, there is a much richer set of body text rules than header
rules. I think they should be leveraged against the image text (and
attached document text) as well. After all, they are just variant
delivery methods for the same message: BUY MY SHIT^WSTUFF!

> The idea of sticking text from OCR'd images into the body is
> interesting -- however, I'm not sure it'd be useful in this case.
> One key aspect that makes the rules accurate, is that it's not
> that the text appears *anywhere* in the mail; it's that the text
> appears in an OCR'd image.

Okay, how about this: a "variant-encapsulation" object in $PMS that
holds the text extracted from images and documents. The body rules are
run over that text, and a multiplier or threshold (or some such)
controls how the score from those body rules is applied to the message
as a whole.

What bothers me is the separate list of simplified matching rules that
FuzzyOCR is using. I think that it would be better in the long run to
leverage the rich set of existing body rules rather than maintaining a
separate set of simple rules.

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  People seem to have this obsession with objects and tools as being
  dangerous in and of themselves, as though a weapon will act of its
  own accord to cause harm. A weapon is just a force multiplier. It's
  *humans* that are (or are not) dangerous.
-----------------------------------------------------------------------
 23 days until Talk Like a Pirate day
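
A rough sketch of the "variant-encapsulation" idea proposed above may make
the multiplier/threshold part more concrete. It is written in Python purely
for illustration and is not SpamAssassin code: the names BodyRule,
VariantEncapsulation and score_variant, the sample rules, and the exact way
the threshold and multiplier combine are all assumptions.

```python
import re
from dataclasses import dataclass


@dataclass
class BodyRule:
    """Hypothetical stand-in for an existing body rule: a regex plus a score."""
    name: str
    pattern: str
    score: float


@dataclass
class VariantEncapsulation:
    """Hypothetical container for text pulled out of an image or attachment."""
    source: str              # e.g. "ocr-image" or "word-attachment"
    text: str                # the extracted text the body rules run over
    multiplier: float = 1.0  # scales the contribution to the message score
    threshold: float = 0.0   # raw score below this contributes nothing


def score_variant(variant, rules):
    """Run the ordinary body rules over the extracted text.

    Returns (contribution, hit_names). The contribution is the summed rule
    score times the multiplier, or 0.0 if the raw sum is under the threshold.
    """
    hits = [r for r in rules if re.search(r.pattern, variant.text, re.IGNORECASE)]
    raw = sum(r.score for r in hits)
    hit_names = [r.name for r in hits]
    if raw < variant.threshold:
        return 0.0, hit_names
    return raw * variant.multiplier, hit_names


# Assumed example rules; real body rules would be reused unchanged.
rules = [
    BodyRule("BUY_MY_STUFF", r"buy\s+my\s+stuff", 2.0),
    BodyRule("CHEAP_PILLS", r"cheap\s+pills", 1.5),
]

ocr_text = VariantEncapsulation("ocr-image", "BUY MY STUFF today",
                                multiplier=1.5, threshold=1.0)
contribution, hit_names = score_variant(ocr_text, rules)
```

The point of the design is that the same rule list serves both the plain
body and the extracted text; only the per-variant multiplier and threshold
differ, which preserves the "this text came from an OCR'd image" signal
without a second, simplified rule set.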