Does anybody have ExtractText working with SA 3.4?
http://whatever.truls.org/graphdefang/ExtractText.zip

I loved this third-party plugin back in SA 3.2.5.
Every once in a while some attachment spam gets through.

Running unrtf on the command line works and gives the expected output:
/usr/local/bin/unrtf -t ExtractText.tags -nopict  RTF.rtf

The debug output shows that nothing was extracted:

Mar  7 10:22:15.405 [18289] dbg: extracttext: set: magic=1
Mar  7 10:22:15.405 [18289] dbg: extracttext: external: antiword "/usr/bin/antiword","-t","-w","0","-m","UTF-8.txt","-"
Mar  7 10:22:15.406 [18289] dbg: extracttext: use: antiword name .*\.doc
Mar  7 10:22:15.406 [18289] dbg: extracttext: use: antiword name .*\.dot
Mar  7 10:22:15.406 [18289] dbg: extracttext: use: antiword type application/(?:vnd\.?)?ms-?word.*
Mar  7 10:22:15.406 [18289] dbg: extracttext: external: unrtf "/usr/local/bin/unrtf","-t","ExtractText.tags","--nopict"
Mar  7 10:22:15.406 [18289] dbg: extracttext: use: unrtf name .*\.doc
Mar  7 10:22:15.407 [18289] dbg: extracttext: use: unrtf name .*\.rtf
Mar  7 10:22:15.407 [18289] dbg: extracttext: use: unrtf type application/rtf
Mar  7 10:22:15.407 [18289] dbg: extracttext: use: unrtf type text/rtf
Mar  7 10:22:15.407 [18289] dbg: extracttext: external: odt2txt "/usr/bin/odt2txt","--encoding=UTF-8","${file}"
Mar  7 10:22:15.407 [18289] dbg: extracttext: use: odt2txt name .*\.odt
Mar  7 10:22:15.407 [18289] dbg: extracttext: use: odt2txt name .*\.ott
Mar  7 10:22:15.408 [18289] dbg: extracttext: use: odt2txt type application/.*?opendocument.*text
Mar  7 10:22:15.408 [18289] dbg: extracttext: use: odt2txt name .*\.sdw
Mar  7 10:22:15.408 [18289] dbg: extracttext: use: odt2txt name .*\.stw
Mar  7 10:22:15.408 [18289] dbg: extracttext: use: odt2txt type application/(?:x-)?soffice
Mar  7 10:22:15.408 [18289] dbg: extracttext: use: odt2txt type application/(?:x-)?starwriter
Mar  7 10:22:15.409 [18289] dbg: extracttext: external: pdftohtml "/usr/bin/pdftohtml","-i","-xml","-stdout","-noframes","${file}"
Mar  7 10:22:15.409 [18289] dbg: extracttext: external: pdftotext "/usr/bin/pdftotext","-q","-nopgbrk","-enc","UTF-8","${file}","-"
Mar  7 10:22:15.409 [18289] dbg: extracttext: use: pdftotext name .*\.pdf
Mar  7 10:22:15.409 [18289] dbg: extracttext: use: pdftotext type application/pdf
Mar  7 10:22:18.048 [18289] dbg: extracttext: MIME database: /usr/share/mime
Mar  7 10:22:18.152 [18289] dbg: extracttext: Part: application/rtf RTF.rtf
Mar  7 10:22:18.152 [18289] dbg: extracttext: Match: name "RTF.rtf" =~ ".*\.rtf"
Mar  7 10:22:18.213 [18289] dbg: extracttext: External call: unrtf "/usr/local/bin/unrtf","-t","ExtractText.tags","--nopict"
Mar  7 10:22:18.214 [18289] info: extracttext: External extraction command: "/usr/local/bin/unrtf","-t","ExtractText.tags","--nopict"
Mar  7 10:22:18.214 [18289] info: extracttext: External extraction object: 17 application/rtf "RTF.rtf"
Mar  7 10:22:18.214 [18289] info: extracttext: External extraction error: unrtf 0 ?
Mar  7 10:22:18.259 [18289] dbg: extracttext: Not extracted
Mar  7 10:22:18.259 [18289] dbg: extracttext: X-ExtractText-Words: 0
Mar  7 10:22:18.259 [18289] dbg: extracttext: X-ExtractText-Chars: 0
Mar  7 10:22:18.389 [18289] dbg: bayes: header tokens for x-extracttext-chars = " 0"
Mar  7 10:22:18.389 [18289] dbg: bayes: header tokens for x-extracttext-words = " 0"
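
One way to narrow this down might be to run unrtf the way the plugin appears to. The configured unrtf command has no ${file} placeholder and no file argument, so the attachment is presumably piped in on standard input (an assumption; the log does not confirm it). The command-line test passed the file name as an argument and spelled the option -nopict, while the configured command uses --nopict, so the two runs are not identical. A minimal Python sketch follows; run_with_stdin is a hypothetical helper name, not part of ExtractText or SpamAssassin.

```python
import subprocess


def run_with_stdin(argv, path):
    """Run argv with the contents of `path` piped to its standard input.

    Returns (exit_code, stdout_bytes, stderr_bytes) so an empty result can be
    told apart from a failing command.
    """
    with open(path, "rb") as fh:
        proc = subprocess.run(
            argv,
            stdin=fh,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    return proc.returncode, proc.stdout, proc.stderr


# Example call, with arguments copied from the "External call" debug line.
# Feeding the file on stdin is an assumption about how the plugin runs it.
# code, out, err = run_with_stdin(
#     ["/usr/local/bin/unrtf", "-t", "ExtractText.tags", "--nopict"],
#     "RTF.rtf",
# )
# print(code, len(out), err.decode("utf-8", "replace"))
```

Running the example call as the same user and from the same directory that SpamAssassin uses would show whether unrtf produces any output, and what it writes to stderr, when invoked in that mode.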

Thanks,
Scott Ostrander
