Matus UHLAR - fantomas wrote:
> Yes, but generic plugin should be able extract images for later processing
> (FuzzyOCR or maybe even things like Bayes) too ;)
That would depend on what you mean by "generic". :-)
It's a generic text extractor plugin, with the ability to call an OCR
program for getting text from images. Which is what I wanted, and is what
John mentioned in his post.
It's not a generic attachment parser and object extractor (though it
might become one).
I do want it to be able to add stuff rendered to HTML, but
Mail::SpamAssassin::Message::Node doesn't (currently) have a
set_rendered variant for doing that, and I haven't had the time to work
on Mail::SpamAssassin::Message::Node.
I'm not sure exactly what would be the correct way to add parts (such as
extracted images) to the message. I have thought about it, and the
plugin's plugin architecture does support this. I just haven't had the
time to find out how to do it.
I don't know what you mean by "even things like Bayes". The plugin does
make the extracted text available to Bayes (this is what I made it for),
and it can call OCR programs.
Making extracted images available for FuzzyOCR is (as mentioned above)
something I want to do. Since I don't do any OCR at all here, that's a
pretty low priority though (unless people start asking for it more).
Regards
/Jonas
--
Jonas Eckerman
Fruktträdet & Förbundet Sveriges Dövblinda
http://www.fsdb.org/
http://www.frukt.org/
http://whatever.frukt.org/