Ruud,

I see the same issue you describe, but after looking at an output file in 
a hex editor the reason is clear. Tesseract seems to detect line breaks 
perfectly fine, but it only inserts the line feed character (0x0A) and not 
the carriage return + line feed pair (0x0D 0x0A) that a Windows text file 
expects.

So it would be fairly easy to take the output from Tesseract and feed it 
through another converter that changes every 0x0A byte to 0x0D 0x0A. It is 
unfortunate, though, that Tesseract does not seem to offer such an option 
itself.
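
For what it's worth, here is a minimal sketch of such a converter in Python. 
The function names are my own and are not part of Tesseract. It assumes the 
whole output file fits in memory, and it skips any line feed that is already 
preceded by a carriage return so that existing 0x0D 0x0A pairs are not 
doubled.

```python
import re


def lf_to_crlf(data: bytes) -> bytes:
    """Replace each bare LF (0x0A) with CRLF (0x0D 0x0A).

    The negative lookbehind leaves LF bytes that already follow a CR
    untouched, so running the conversion twice is harmless.
    """
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write the CRLF version to dst_path."""
    # Binary mode avoids any newline translation by Python itself.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(lf_to_crlf(data))
```

Calling convert_file("out.txt", "out_crlf.txt") would then write a copy that 
opens correctly in Notepad; both file names here are only placeholders.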

On Friday, April 5, 2013 at 4:20:21 PM UTC-5, Ruud van Houtum wrote:
>
> Hello,
>
> I am using Tesseract to output text files from scanned documents. 
> All text images contain typed text and are fairly clear/clean. So far 
> Tesseract has a pretty good accuracy and I am quite content.
>
> However, Tesseract doesn't seem to recognize line breaks, and I was 
> wondering whether this is an available option or not.
> At first I thought this was not possible, but searching online brings up 
> topics (such as: 
> http://code.google.com/p/tesseract-ocr/issues/detail?id=575) which seem 
> to show that it should be possible.
>
> Is there a parameter that should be included in the command prompt?
> I am using Windows 7, cmd.exe.
>
> Thanks in advance,
> R
>
>
> BTW, I would recommend adding 
> http://tesseract-ocr.googlecode.com/svn-history/trunk/doc/tesseract.1.html 
> to the wiki page. It took me a long time to find this page (it's hidden in 
> the FAQ), and it provides some helpful information about the parameters.
>
