I am using ExtractText from http://whatever.frukt.org/spamassassin.text.shtml#ExtractText.pm. It extracts text from attached .rtf, .doc, and some other formats, then feeds the results to BAYES and the normal body tests.

My issue is that it works great with SA 3.2.5, but on the same server it does not give any results with SA 3.3.1. When I downgraded SA back to 3.2.5, ExtractText worked again.

The dbg output looks like this in 3.3.1:

Jun 3 07:54:17.447 [11937] dbg: extracttext: Part: application/msword spam.doc
Jun 3 07:54:17.447 [11937] dbg: extracttext: Match: name "spam.doc" =~ ".*\.doc"
Jun 3 07:54:17.534 [11937] dbg: extracttext: External call: antiword "/usr/bin/antiword","-t","-w","0","-m","UTF-8.txt","-"
Jun 3 07:54:17.537 [11937] info: extracttext: External extraction command: "/usr/bin/antiword","-t","-w","0","-m","UTF-8.txt","-"
Jun 3 07:54:17.537 [11937] info: extracttext: External extraction object: 17 application/msword "spam.doc"
Jun 3 07:54:17.538 [11937] info: extracttext: External extraction error: antiword 0 ?
Jun 3 07:54:17.538 [11937] dbg: extracttext: Match: name "spam.doc" =~ ".*\.doc"
Jun 3 07:54:17.538 [11937] dbg: extracttext: External call: unrtf "/usr/local/bin/unrtf","-t","ExtractText.tags","--nopict"
Jun 3 07:54:17.539 [11937] info: extracttext: External extraction command: "/usr/local/bin/unrtf","-t","ExtractText.tags","--nopict"
Jun 3 07:54:17.540 [11937] info: extracttext: External extraction object: 17 application/msword "spam.doc"
Jun 3 07:54:17.540 [11937] info: extracttext: External extraction error: unrtf 0 ?
Jun 3 07:54:17.616 [11937] dbg: extracttext: Magic: application/x-ole-storage
Jun 3 07:54:17.617 [11937] dbg: extracttext: Not extracted
Jun 3 07:54:17.617 [11937] dbg: extracttext: X-ExtractText-Words: 0
Jun 3 07:54:17.617 [11937] dbg: extracttext: X-ExtractText-Chars: 0

The dbg output looks like this in 3.2.5:

[7828] dbg: extracttext: Part: application/msword spam.doc
[7828] dbg: extracttext: Match: name "spam.doc" =~ ".*\.doc"
[7828] dbg: extracttext: External call: antiword "/usr/bin/antiword","-t","-w","0","-m","UTF-8.txt","-"
[7828] info: extracttext: Extracted 40 chars using antiword
[7828] info: extracttext: Text: Viagra
[7828] info: extracttext: Text: Free sex
[7828] info: extracttext: Text: Free porn
[7828] info: extracttext: Text: Cash Out Now
[7828] dbg: extracttext: X-ExtractText-Words: 8
[7828] dbg: extracttext: X-ExtractText-Chars: 40
[7828] dbg: extracttext: X-ExtractText-Tools: antiword
[7828] dbg: extracttext: X-ExtractText-Types: application/msword
[7828] dbg: extracttext: X-ExtractText-Extensions: doc

Any thoughts on how to get it to work with 3.3.1?

_____________________________
Scott Ostrander
Staff System Administrator