ExtractText can handle any attachment type you want. However, it's just a
framework. We use it to call a script that extracts images, office
documents, and PDF files and passes them through a range of tools (antiword,
unzip [newer Office documents are just XML files in a zip file], gocr) that
convert them into text - whi
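
To illustrate the "XML files in a zip file" remark, here is a minimal
sketch in Python of pulling the text out of a .docx attachment using only
the standard library. This is not the script mentioned above, just an
assumption of what one step of such a script could look like. The function
name docx_to_text is hypothetical; the path word/document.xml and the
namespace are the ones defined by the WordprocessingML format.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside .docx files.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(data: bytes) -> str:
    """Return the paragraph text of a .docx file given its raw bytes.

    Hypothetical helper for illustration only: it reads the main
    document part and ignores headers, footers, and embedded images.
    """
    # A .docx file is a zip archive; the main body is word/document.xml.
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        xml_bytes = zf.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Each <w:p> is a paragraph; its text is split across <w:t> elements.
    for para in root.iter(W_NS + "p"):
        runs = [t.text or "" for t in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

The returned string could then be handed to whatever adds extracted text
to the message for scanning. Older binary .doc files are not zip archives
and would still need a tool such as antiword.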
Thank you Jari,
> > In the same way, I am wondering if something similar exists for all
> > the (open|libre|MS)office documents?
> ExtractText
>
> Works with these documents as well as with PDF.
I had a look at ExtractText, but it only extracts text, not the images.
And antiword, the extractor for
on anything newer than 3.2.5
>
> Scott
>
>> -Original Message-
>> From: Jari Fredriksson [mailto:ja...@iki.fi]
>> Sent: Thursday, November 15, 2012 6:01 AM
>> To: users@spamassassin.apache.org
>> Subject: Re: PDFassassin
>>
>> 15.11.2012 13:31,
On Thu, 15 Nov 2012, Olivier Nicole wrote:
Finally, I am wondering if FuzzyOCR is still of any interest? As
above, I'd like to see it push the strings it can identify to the body
of the message, for further analysis by SA, rather than having its
own list of spam words.
I believe that FuzzyOCR
15.11.2012 13:31, Olivier Nicole wrote:
> In the same way, I am wondering if something similar exists for all
> the (open|libre|MS)office documents?
ExtractText
Works with these documents as well as with PDF.
--
Fame is a vapor; popularity an accident; the only earthly certainty is
oblivion.
-Original Message-
From: Bob Pierce [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 14, 2007 11:00 AM
To: users@spamassassin.apache.org
Subject: PDFAssassin
Is anybody using the PDFAssassin module from
http://blog.atmail.com/?p=61
I don't think I have seen it discussed on the list yet.