Is it possible to instruct tesseract, for the image:

Let us build a snow-
man on the lawn.

to output in txt format:

Let us build a snowman
on the lawn.

This would almost preserve line breaks, while at the same time making hyphenated words whole and searchable. It seems to me that the source has code to recognize hyphenated words, and it should be possible to implement this behaviour as an option.

--
Lars Aronsson (l...@aronsson.se)
Project Runeberg - free Nordic literature - http://runeberg.org/
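
For illustration, the requested behaviour can be approximated as a post-processing step on the plain-text output. The sketch below is an assumption of mine, not an existing tesseract option: the function name `dehyphenate` is hypothetical, and it naively treats every hyphen at a line end as a soft hyphen, so a genuinely hyphenated compound broken at the hyphen would be joined incorrectly.

```python
import re


def dehyphenate(text):
    """Join words split by a line-end hyphen, keeping the other line breaks."""
    out = []
    for line in text.split("\n"):
        # Previous line ends in "<word char>-" and this line starts with a word:
        # treat the pair as one word broken across the line end.
        if out and re.search(r"\w-$", out[-1]) and re.match(r"\w", line):
            head, _, rest = line.partition(" ")
            # Drop the trailing hyphen and pull the first word of this line up.
            out[-1] = out[-1][:-1] + head
            if rest:
                out.append(rest)
        else:
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    print(dehyphenate("Let us build a snow-\nman on the lawn."))
```

Run on the example above, this moves "man" up to complete "snowman" and leaves the line break in place before "on the lawn.", which is the "almost preserved" layout described in the request.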