On Fri, Apr 22, 2016 at 12:27 AM, S.J. Becker <scottbecke...@gmail.com>
wrote:

>
> I just did more testing.
>
> My one word or single character image works with
> -psm 7
> -psm 8
>
> My two- or three-line text image works with the default of
> -psm 3
> as well as
> -psm 4
>
> They both seem to work with
> -psm 6
>
> I may have to go with 6 even though my three-line test with different
> font sizes should be done with 4 based on its description.
>
> I feel it's a bug that 3 and 4 can't reliably handle simpler content.
> To get the most out of Tesseract, I must analyze the segmentation?!
>

Why analyze? Don't you know in advance whether you are asking it to OCR a
page or just a paragraph, a line, or a word?
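
If the caller already knows what kind of image it is sending, the mode can be
picked up front. A minimal sketch, assuming Python and a tesseract binary on
the PATH; the names PSM_BY_CONTENT, build_command and ocr are hypothetical
helpers, not part of Tesseract. The mode numbers follow tesseract's own help
text, and the flag is spelled -psm as in this thread (newer releases spell it
--psm, hence the flag parameter).

```python
import subprocess

# Page segmentation modes, as described in tesseract's help output.
PSM_BY_CONTENT = {
    "page": 3,    # fully automatic page segmentation, no OSD (default)
    "column": 4,  # single column of text of variable sizes
    "block": 6,   # single uniform block of text
    "line": 7,    # single text line
    "word": 8,    # single word
    "char": 10,   # single character
}


def build_command(image_path, content, flag="-psm"):
    """Build the tesseract argv for a known content type (hypothetical helper)."""
    psm = PSM_BY_CONTENT[content]
    # "stdout" as the output base makes tesseract print the text
    # instead of writing a .txt file.
    return ["tesseract", image_path, "stdout", flag, str(psm)]


def ocr(image_path, content, flag="-psm"):
    """Run tesseract once with the mode that matches the content type."""
    result = subprocess.run(
        build_command(image_path, content, flag),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()
```

For example, ocr("word.png", "word") would run tesseract with -psm 8, and
ocr("scan.png", "page") would use the default mode 3.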

>
> That is why I had to go through the trouble of compiling leptonica;
> so that tesseract is smart enough that I don't have to re-invent the wheel.
>

Tesseract uses Leptonica as a dependency so that it does not need to
reinvent the wheel.

>
> It seems that it's failing at the segmentation stage. If it finds nothing
> it could try again automatically with a more primitive setting. That is
> way more efficient than my process spawning tesseract twice as often.
>
>     thanks
>     scott
>
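
Tesseract does not retry by itself, but the retry described above can be done
in a small wrapper so that the second process is only spawned when the first
mode returns nothing. This is only a sketch under stated assumptions: Python,
tesseract on the PATH, and hypothetical helper names (run_tesseract,
ocr_with_fallback). The order 3, 6, 7 is an assumption based on the modes
reported as working in this thread, not a recommendation from the Tesseract
documentation.

```python
import subprocess


def run_tesseract(image_path, psm, flag="-psm"):
    """Run tesseract once and return whatever it printed to stdout."""
    result = subprocess.run(
        ["tesseract", image_path, "stdout", flag, str(psm)],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    return result.stdout


def ocr_with_fallback(image_path, modes=(3, 6, 7), run=run_tesseract):
    """Try each mode in order and stop at the first non-empty result.

    Returns (mode, text), or (None, "") if every mode came back empty.
    The run argument is injectable so the logic can be tested without
    tesseract installed.
    """
    for psm in modes:
        text = run(image_path, psm).strip()
        if text:
            return psm, text
    return None, ""
```

Images that work with the first mode still cost a single process; only the
ones that come back empty are run again with a simpler mode.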
> On Thursday, April 21, 2016 at 4:21:47 AM UTC-7, zdenop wrote:
>>
>> Please read the wiki
>> https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality#page-segmentation-method
>>
>> Zdenko
>>
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/e9f5cb1a-374f-49b6-82ef-795b009e0180%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/e9f5cb1a-374f-49b6-82ef-795b009e0180%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
> For more options, visit https://groups.google.com/d/optout.
>
