I'm using Tesseract 5.3.1 on Windows to recognize text that includes special characters such as ñ, ü, and á. Most of these characters belong to the Latin script, so I've specified that on the command line.
In this image, the special characters are ñ, Ñ, á, and é.

[image: text.png]

The command line I'm using is:

    tesseract text.png stdout --psm 6 -l Latin -c tessedit_char_whitelist=aáeéiocfhklmnñtÑ

However, the output text is missing the white space between words, and the special characters are ignored completely, resulting in:

    aoloaalcalmoo
    okonioniachillalif

Do you know why Tesseract is not taking into account the characters I've declared in the whitelist? Maybe I'm not specifying the special characters correctly.

Any help is greatly appreciated.
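One thing that may be worth ruling out is the Windows console altering the non-ASCII characters before Tesseract receives them. The following is a minimal sketch of that check, not a confirmed fix: it writes the same whitelist to a UTF-8 config file and passes that file as a trailing argument instead of using `-c`. The file name `whitelist.cfg` and the helper names are hypothetical, and the sketch assumes Python is available and that `Latin.traineddata` is installed in the tessdata directory.

```python
import tempfile
from pathlib import Path

# Whitelist copied from the command in this post.
WHITELIST = "aáeéiocfhklmnñtÑ"


def write_whitelist_config(directory, whitelist=WHITELIST):
    """Write a UTF-8 Tesseract config file that sets tessedit_char_whitelist."""
    # "whitelist.cfg" is a hypothetical name chosen for this sketch.
    config_path = Path(directory) / "whitelist.cfg"
    # Config files hold one "variable value" pair per line, separated by a space.
    config_path.write_text(
        f"tessedit_char_whitelist {whitelist}\n", encoding="utf-8"
    )
    return config_path


def build_command(image, config_path):
    """Build the argument list, with the config file as the last argument."""
    return [
        "tesseract", image, "stdout",
        "--psm", "6",
        "-l", "Latin",
        str(config_path),
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = write_whitelist_config(tmp)
        # Only prints the command; running it requires Tesseract on PATH,
        # e.g. subprocess.run(build_command("text.png", cfg), ...).
        print(build_command("text.png", cfg))
```

If the output is the same with the config file, console encoding is probably not the cause and the whitelist handling itself would be the next thing to look at.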