[tesseract-ocr] How to do OCR for 96dpi screenshots from computer display with 100% accuracy?

Vadim Melnik Wed, 18 Oct 2023 22:07:29 -0700

Hello,

We are processing PNG screenshots of a computer display at 96 dpi 
resolution. They are plain black-and-white images rendered in a single known 
TrueType font at a fixed size, without antialiasing or any other subpixel 
rendering. The picture structure is clear, opaque and pixelated, as shown 
below, and each character (glyph) always has the same 2D pixel structure:


[image: screen1.png]
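
To illustrate what "always the same 2D structure" means in practice, here is a 
minimal sketch in Python of an exact bitmap lookup. It is not a Tesseract 
feature, only a toy illustration. It assumes every glyph sits in a 
fixed-width cell (our font is fixed size, but fixed-width cells are an 
assumption made for this example), and the 3x3 glyphs and all function names 
are hypothetical:

```python
# Toy exact-match lookup: each glyph cell is used as a dictionary key.
# Assumption: glyphs sit in fixed-width cells; bitmaps are lists of rows of 0/1.

def cell_key(bitmap, x0, width):
    """Return a hashable copy of the cell that starts at column x0."""
    return tuple(tuple(row[x0:x0 + width]) for row in bitmap)

def build_table(samples, width):
    """Map each known glyph bitmap to its character."""
    return {cell_key(bitmap, 0, width): char for char, bitmap in samples.items()}

def decode_line(bitmap, table, width):
    """Slice a text line into cells and look each one up; '?' marks unknown cells."""
    cells = len(bitmap[0]) // width
    return "".join(
        table.get(cell_key(bitmap, i * width, width), "?") for i in range(cells)
    )

# Hypothetical 3x3 glyphs standing in for the real font.
GLYPH_I = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
GLYPH_L = [[1, 0, 0], [1, 0, 0], [1, 1, 1]]

TABLE = build_table({"I": GLYPH_I, "L": GLYPH_L}, 3)

# A line made of the glyphs I, L, I placed side by side.
LINE = [GLYPH_I[r] + GLYPH_L[r] + GLYPH_I[r] for r in range(3)]
RESULT = decode_line(LINE, TABLE, 3)
print(RESULT)
```

With identical pixels for identical characters, such a lookup either matches 
exactly or reports an unknown cell; a proportional font would need a 
segmentation step first.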

The main goal is simple, fast and efficient (in time and memory) OCR of 
these screen dumps with 100% accuracy. We tried Tesseract 4/5 with the old 
Cubic and new LSTM models in default mode, with mediocre results (about 60%). 
We then trained both models on the custom font with upscaling to 288-384 dpi. 
Unfortunately the final output is still not good enough: recognition is 
definitely better, around 90%+, but not 100%, and the upscaling increases 
memory use and processing time about tenfold.
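
For reference, the roughly tenfold cost is about what simple pixel arithmetic 
predicts. The small Python sketch below (the function name is made up for 
illustration) assumes memory and processing grow in proportion to the pixel 
count, which is only an approximation:

```python
def upscale_cost(base_dpi, target_dpi):
    """Return (linear scale factor, pixel-count factor) for a DPI upscale."""
    scale = target_dpi / base_dpi
    # Both width and height grow by `scale`, so pixels grow by scale squared.
    return scale, scale * scale

for dpi in (288, 384):
    scale, pixels = upscale_cost(96, dpi)
    print(f"96 -> {dpi} dpi: {scale:.0f}x per side, {pixels:.0f}x pixels")
```

So 288 dpi means 9 times the pixels of the 96 dpi original, and 384 dpi means 
16 times.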

Does anyone know whether Tesseract provides some configuration or 
functionality for this kind of OCR? Or maybe some other open-source OCR tool 
is a better fit for this task, such as EasyOCR, OpenCV, OCRopus, GOCR, etc.?

--
Thanks,
Vadim.

