[tesseract-ocr] Text-wrap recognition

Ajg Sun, 18 Aug 2024 09:48:17 -0700

I have scans of various old (1920) magazines with multiple blocks of text 
on a single page. Right now, the OCR reads text lines straight across the 
page, so the output interleaves the columns rather than wrapping from the 
end of one column to the start of the next. Is there any way to get the 
recognized text concatenated, with one block following the next block? 
Note: these are not all fixed-size columns. I tried all the page 
segmentation modes, but the best I get is interleaved text.
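
To make the desired output order concrete, here is a minimal sketch in 
Python. It assumes each text block's bounding box and text are already 
available from some layout output; the `Block` type and the function names 
are hypothetical and not part of Tesseract. Blocks whose horizontal extents 
overlap are treated as one column, so columns do not need a fixed width.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """Hypothetical text block: pixel bounding box plus recognized text."""
    left: int
    top: int
    right: int
    bottom: int
    text: str


def order_blocks_by_column(blocks: List[Block]) -> List[Block]:
    """Return blocks column by column (left to right), top to bottom."""
    columns: List[List[Block]] = []
    # Visit blocks from left to right so columns are discovered in order.
    for block in sorted(blocks, key=lambda b: b.left):
        for column in columns:
            col_left = min(b.left for b in column)
            col_right = max(b.right for b in column)
            # A block belongs to a column if their horizontal ranges overlap.
            if block.left < col_right and block.right > col_left:
                column.append(block)
                break
        else:
            # No overlapping column found: start a new one.
            columns.append([block])

    columns.sort(key=lambda col: min(b.left for b in col))
    ordered: List[Block] = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b.top))
    return ordered


def join_blocks(blocks: List[Block]) -> str:
    """Concatenate block texts in column order, one blank line between."""
    return "\n\n".join(b.text for b in order_blocks_by_column(blocks))


if __name__ == "__main__":
    # Made-up coordinates for a two-column page with uneven column widths.
    page = [
        Block(0, 0, 100, 50, "col1 top"),
        Block(120, 0, 260, 40, "col2 top"),
        Block(5, 60, 95, 120, "col1 bottom"),
        Block(110, 50, 250, 130, "col2 bottom"),
    ]
    print(join_blocks(page))
```

This only illustrates the ordering I am after. It would not handle 
headlines or pictures that span several columns.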


