Apply a threshold (Otsu) and count the white and black pixels; this will tell you whether you have white text on a dark background or the opposite. If necessary, negate the image so that you have dark text on a bright background.
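A minimal sketch of the polarity check described above, in plain Python with no dependencies. The function names `otsu_threshold` and `dark_text_on_light` are made up for this illustration, and the image is assumed to be rows of 8-bit grey values. In practice a library routine such as OpenCV's `cv2.threshold` with the `cv2.THRESH_OTSU` flag would do the thresholding for you.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold for a flat list of 8-bit grey values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        # Pixels with value <= t form the dark class.
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor).
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def dark_text_on_light(image):
    """Binarize the image and return it with black (0) text on white (255)."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    binary = [[255 if p > t else 0 for p in row] for row in image]

    white = sum(v == 255 for row in binary for v in row)
    black = len(flat) - white
    # Assumption: text covers fewer pixels than the background, so a
    # majority of black pixels means light text on a dark background.
    if black > white:
        binary = [[255 - v for v in row] for row in binary]
    return binary
```

The majority test is only a heuristic: it assumes the background occupies more of the image than the text, which may not hold for tightly cropped or very bold text.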
The images are very small; you want at least 35 to 50 px. Try to get them larger if possible; otherwise upscaling might help. Sharpening or other steps might also help before the threshold.

Lorenzo

On Sun, 23 Feb 2020 at 12:29, Jonathan Dahan <jdaha...@gmail.com> wrote:

> Hi, I would love to know which type of custom preprocessing these images
> would go through in order to be the best to successfully read them. Note
> that the filter/preprocess needs to be generic which means to work across
> all four images.
>
> https://i.stack.imgur.com/4N62x.png
> https://i.stack.imgur.com/3upPM.png
> https://i.stack.imgur.com/WcrGU.png
> https://i.stack.imgur.com/ymhv6.png
>
> Thanks.