Matus,

>>Sure the OCR results are not very precise. But could we imagine that
>>they are pushed in a part of the message that will not go through Bayes?
> where do you want to push the OCR'ed text, if not back to SA to check other
> rules like bayes?

To a part that would be checked by regexp rules, but not by Bayes? I don't know if that is possible.

> the PDF is technically something different: PDF (often) contains plain text,
> that does not have to be OCRed and thus it will not be misinterpreted.

But wouldn't it disturb the Bayes process if we inject the mail body plus the text extracted from the PDF? Wouldn't it be better to submit only the original message? I have no answer to that.

> I would skip gocr and ocrad, since tesseract behaves great now...
> (the debian fuzzyocr package requires all of them, dunno why)

I'll take your advice. I just noticed that tesseract was not enabled by default! I use FreeBSD. Could it be that it is required only at install time, but disabled later in your FuzzyOcr configuration?

Best regards,
Olivier