On Thu, 21 Dec 2023, 15:22 Art Rhyno, <artrh...@uwindsor.ca> wrote:
> If
>
Important extra note (as I see a new image that's white text on a black background):

Tesseract was trained on black text on a white background, targeting OCR of books, publications and academic papers. To improve your chances, ALWAYS make sure your text (the letters) is black (or at least very dark grey) and your background is white. That is what the engine was trained on and for, so black text on a white background is what you should strive for in any image you intend to feed to tesseract. (See also the notes in my email response in another thread here, just a few minutes ago.)

It is in the documentation, but it is easy to miss when you don't realize what you're reading there, so this is the main thing to check and ensure:

- text is black
- background is white
- a greyscale image is fine, and possibly better for the added edge detail, but you invariably aim for something that looks as close as possible to "black print on white paper".

Thus the shown output (processed) image should be inverted to match the above conditions.

HTH
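For illustration only, here is a minimal Python sketch of the "invert if the background is dark" check described above. It works on raw 8-bit greyscale values (0 = black, 255 = white) so that it has no dependencies. The function names `invert` and `ensure_black_on_white`, the threshold of 128, and the median heuristic are my own assumptions for this example, not anything tesseract itself provides.

```python
from statistics import median


def invert(pixels: bytes) -> bytes:
    """Invert 8-bit greyscale values: 0 (black) <-> 255 (white)."""
    return bytes(255 - p for p in pixels)


def ensure_black_on_white(pixels: bytes, threshold: int = 128) -> bytes:
    """Return the pixels with dark text on a light background.

    Assumption: the background covers most of the image, so the median
    brightness indicates whether the background is dark or light.
    """
    if not pixels:
        return pixels
    if median(pixels) < threshold:
        # Mostly dark pixels: treat as light text on dark, so flip it.
        return invert(pixels)
    return pixels


# White text (255) on a black background (0): 2 text pixels out of 8.
white_on_black = bytes([0, 0, 255, 0, 0, 255, 0, 0])
print(list(ensure_black_on_white(white_on_black)))
# prints [255, 255, 0, 255, 255, 0, 255, 255]
```

With a real image you would more likely load it with an imaging library and invert it there; in Pillow, for example, `ImageOps.invert()` on a greyscale ("L" mode) image does the same flip. The median check is only a heuristic and can guess wrong on images where the text covers more area than the background.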