I have data from various old (1920) magazines with multiple blocks of text on a single page. Right now, OCR reads the text lines straight across the page, so the output from adjacent columns is interleaved instead of each column being read top to bottom before moving on to the next one. Is there any way to get the OCR text concatenated so that one block follows the next? Note: the columns are not all a fixed size. I tried all the page segmentation modes (pagesegmode), but the best I get is still interleaved text.
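
To make the desired ordering concrete, here is a minimal sketch in plain Python. It assumes the bounding box and text of each block are already available from some layout step (how to get them is exactly the open question), and the `Block` type and both function names are hypothetical, not part of any OCR library. Blocks whose horizontal ranges overlap are treated as one column, so the columns do not need to be a fixed width.

```python
from dataclasses import dataclass


@dataclass
class Block:
    # Hypothetical container: pixel bounding box plus the recognized text.
    left: int
    top: int
    right: int
    bottom: int
    text: str


def group_into_columns(blocks):
    """Group blocks whose horizontal ranges overlap into the same column."""
    columns = []  # each column: {"left": int, "right": int, "blocks": [Block]}
    for block in sorted(blocks, key=lambda b: b.left):
        for col in columns:
            # Horizontal overlap test against the column's current extent.
            if block.left < col["right"] and block.right > col["left"]:
                col["blocks"].append(block)
                col["left"] = min(col["left"], block.left)
                col["right"] = max(col["right"], block.right)
                break
        else:
            # No overlapping column found, so this block starts a new one.
            columns.append(
                {"left": block.left, "right": block.right, "blocks": [block]}
            )
    return columns


def read_in_column_order(blocks):
    """Concatenate text column by column (left to right), top to bottom."""
    columns = sorted(group_into_columns(blocks), key=lambda c: c["left"])
    parts = []
    for col in columns:
        for block in sorted(col["blocks"], key=lambda b: b.top):
            parts.append(block.text)
    return "\n\n".join(parts)


# Two columns of unequal width, listed in the interleaved order
# that a straight across-the-page reading would produce.
page = [
    Block(0, 0, 100, 50, "A1"),
    Block(120, 0, 260, 80, "B1"),
    Block(0, 60, 110, 200, "A2"),
    Block(125, 90, 250, 200, "B2"),
]
print(read_in_column_order(page))
```

This simple overlap rule would merge columns if a headline or image caption spans the full page width, so such blocks would need to be handled separately.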