> John Hardin wrote:
>
>> There were mutterings about a generic plugin that would take an
>> attachment, process it somehow (e.g. wvHtml, antiword, ps2ascii, or
>> whatever was appropriate), and insert the results into the body text to
>> be scanned by the regular rules.

On 13.10.09 11:13, Jonas Eckerman wrote:
> That sounds very much like my ExtractText plugin. It can use command
> line tools or perl plugins to extract text from attachments.

Yes, but a generic plugin should be able to extract images for later
processing (FuzzyOCR, or maybe even things like Bayes) too ;)

> There were a bit more than mutterings about it here. :-)
>
> > I don't think anything has come of that yet.
>
> The plugin works, and we are using it in our mail gateway.
>
> It's listed on the Custom Plugins wiki page, and is available at
> <http://whatever.frukt.org/spamassassin.text.shtml>.
>
> It comes with a config for extracting text from Word, OpenXML, RTF, ODF
> and PDF files.

great...

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
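
For illustration only, here is a rough Python sketch of the flow described in the quoted text at the top of this message: pick an extractor by attachment type, run it as an external command, and append whatever text comes back to the body so the regular rules can see it. This is not how ExtractText itself is implemented (that is a SpamAssassin plugin written in Perl). The names `extract_text`, `body_with_attachments` and `DEMO_EXTRACTORS` are made up for this example, and the sketch assumes each extractor reads the attachment on stdin and writes plain text to stdout, which may not match how antiword, ps2ascii or wvHtml are actually invoked.

```python
import subprocess
import sys
from typing import Dict, Iterable, List, Tuple

# Hypothetical table: MIME type -> extractor command line.
# Assumption: every command reads the attachment on stdin and prints text on stdout.
# The demo entry just upper-cases its input with the current Python interpreter,
# so the sketch runs without any real converter installed.
DEMO_EXTRACTORS: Dict[str, List[str]] = {
    "application/x-demo": [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.stdin.read().upper())",
    ],
}


def extract_text(
    payload: bytes,
    mime_type: str,
    extractors: Dict[str, List[str]],
    timeout: float = 10.0,
) -> str:
    """Run the extractor registered for mime_type and return its output.

    Returns an empty string if no extractor is registered or the tool fails.
    """
    command = extractors.get(mime_type)
    if command is None:
        return ""
    try:
        result = subprocess.run(
            command,
            input=payload,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Missing tool, non-zero exit status, or timeout: skip this attachment.
        return ""
    return result.stdout.decode("utf-8", errors="replace")


def body_with_attachments(
    body: str,
    attachments: Iterable[Tuple[str, bytes]],
    extractors: Dict[str, List[str]],
) -> str:
    """Append the text extracted from each attachment to the message body."""
    parts = [body]
    for mime_type, payload in attachments:
        text = extract_text(payload, mime_type, extractors).strip()
        if text:
            parts.append(text)
    return "\n".join(parts)
```

Calling `body_with_attachments("hello", [("application/x-demo", b"buy now")], DEMO_EXTRACTORS)` runs the demo extractor and returns the body with the extracted text appended; attachments with no registered extractor (images, for instance) are simply skipped here, which is exactly the gap the image-extraction comment above is about.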