You don't need this.

I see, I stand corrected!

Maybe ask how to configure ExtractText?

On 17.03.23 13:59, Michael Grant via users wrote:
Sure, I'd be happy to see some examples.  The man page looks pretty
straightforward.

I use exactly what's in the docs and it seems to work.

I have added for debugging:

add_header      all     ExtractText-Chars       _EXTRACTTEXTCHARS_
add_header      all     ExtractText-Words       _EXTRACTTEXTWORDS_
add_header      all     ExtractText-Tools       _EXTRACTTEXTTOOLS_
add_header      all     ExtractText-Types       _EXTRACTTEXTTYPES_
add_header      all     ExtractText-Extensions  _EXTRACTTEXTEXTENSIONS_
add_header      all     ExtractText-Flags       _EXTRACTTEXTFLAGS_

(I use spamass-milter, so these headers don't appear in the incoming mail,
only when I feed it to SA.)

I see it depends on some external tools like tesseract and odt2txt so
I had better install those first.
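
For reference, the tool definitions have roughly the following shape. This is
a sketch from memory of the plugin documentation, not a tested configuration:
the binary paths, the option strings and the file-type patterns are
assumptions, so check them against "perldoc
Mail::SpamAssassin::Plugin::ExtractText" for the installed version. The {}
placeholder stands for the temporary file that holds the attachment.

```
# In a .pre file: load the plugin (it is not enabled by default).
loadplugin Mail::SpamAssassin::Plugin::ExtractText

ifplugin Mail::SpamAssassin::Plugin::ExtractText
  # OpenDocument text via odt2txt (path and options are assumptions)
  extracttext_external  odt2txt    /usr/bin/odt2txt --encoding=UTF-8 {}
  extracttext_use       odt2txt    .odt .ott application/.*?opendocument.*text

  # OCR of images via tesseract; the trailing "-" writes the text to stdout
  extracttext_external  tesseract  {OMP_THREAD_LIMIT=1} /usr/bin/tesseract -c page_separator= {} -
  extracttext_use       tesseract  .jpg .png .bmp .tif .tiff image/(?:jpeg|png|x-ms-bmp|tiff)
endif
```

A tool whose binary is missing simply cannot extract anything, which is why
the external programs need to be installed before these lines are of any use.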

I have not had good luck with tesseract out of the box.  I wonder if
there are options to tune it to make it work better.  Is there
anything better?

I looked at gocr/ocrad/tesseract more than 15 years ago; at that time gocr
seemed to be the best alternative.
Since then, Google started sponsoring tesseract, and it now seems to be the best.
You just need to install the scripts and language files for it.

To see how well this is working, I am hoping to be able to see the
output of these tools with -D so I can write some rules.
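
As a rough idea of what such rules might look like: the sketch below follows
the pattern I remember from the plugin documentation, which matches a flag in
the X-ExtractText-Flags header. The second rule is purely hypothetical (the
name LOCAL_EXTRACTED_PHRASE and the phrase are made up for illustration) and
assumes that extracted text is visible to ordinary body rules. Scores are
placeholders.

```
ifplugin Mail::SpamAssassin::Plugin::ExtractText
  # Flag set when a tool ran but produced no text (pattern as I recall it from the docs)
  header    PDF_NO_TEXT             X-ExtractText-Flags =~ /\bpdftotext_NoText\b/
  describe  PDF_NO_TEXT             PDF without extractable text
  score     PDF_NO_TEXT             0.001

  # Hypothetical body rule; assumes the extracted text is part of the rendered body
  body      LOCAL_EXTRACTED_PHRASE  /\bexample phrase\b/i
  describe  LOCAL_EXTRACTED_PHRASE  Example phrase seen in body or extracted text
  score     LOCAL_EXTRACTED_PHRASE  0.001
endif
```

Keeping the scores near zero while testing lets the rules show up in the
report without affecting the verdict.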

Similarly, is there a way to see the 'body' text that is fed into the
rules?  By 'body', I mean the text with the HTML cleaned out of it plus
the subject line.  I have a message and want to write a new body rule,
so I want to see what spamassassin is using as the 'body' in order to
write the regex.  I don't see the body text in the output of -D.

No idea here.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
I intend to live forever - so far so good.
