Re: [tesseract-ocr] How to optimize tesseract to maximum speed for single number (several digits) recognition

Jan Pohanka Tue, 29 Jan 2019 11:40:14 -0800

Thanks for the suggestions. You are right that I'm referring to the 
api.GetUTF8Text() call; it is my bottleneck.
I was not aware that there are fast and best models in tesseract 4.0; I 
will give them a try. So far I have used just lang=eng or osd.
What I find suspicious is that the calls get longer over time. 
To be more precise, the first 10-15 calls take up to 500ms and later ones 
rise above 1s...
Unfortunately, moving SetSourceResolution outside of the loop makes no 
difference.
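
One way to confirm the drift described above is to time every call separately 
and compare the first calls with the last ones. The following is only a 
minimal sketch: time_calls and drift_ratio are hypothetical helper names, not 
part of tesserocr, and recognize stands for whatever wraps SetImage and 
GetUTF8Text in the real loop.

```python
import time
from statistics import mean


def time_calls(recognize, images):
    """Call recognize(im) for each image; return (results, durations in ms)."""
    results, durations_ms = [], []
    for im in images:
        start = time.perf_counter()
        results.append(recognize(im))
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return results, durations_ms


def drift_ratio(durations_ms, window=10):
    """Mean of the last `window` durations divided by the mean of the first."""
    if len(durations_ms) < 2 * window:
        raise ValueError("need at least 2 * window samples")
    return mean(durations_ms[-window:]) / mean(durations_ms[:window])


# Assumed usage with the loop from the original post:
#
# def recognize(im):
#     api.SetImage(im)
#     api.SetSourceResolution(70)
#     return api.GetUTF8Text()
#
# texts, durations = time_calls(recognize, images)
# print(drift_ratio(durations))
```

A ratio well above 1 on repeated runs over the same images would suggest the 
slowdown depends on elapsed time rather than on image content.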


BR
Jan

On Tuesday, 29 January 2019 at 18:08:49 UTC+1, Lorenzo Blz wrote:
>
>
> First double check if the Pi is not throttling due to overheating or lack 
> of USB power. This may cause the slowdown.
>
> Usually 30-50 px of text height is fine. If the problem is tesseract, try 
> to use the fast model (or "normal" if using best). I assume you are using 
> the 4.x release.
>
> Try tesseract -v to see if you are using all the available CPU 
> optimizations.
>
> Try moving the SetSourceResolution call outside the loop and see if it 
> changes something (maybe it invalidates some caches or something).
>
> The time you are referring to is one single api.GetUTF8Text() call, 
> correct?
>
>
> Lorenzo
>
>
> On Tue, 29 Jan 2019 at 17:48, Jan Pohanka <[email protected]> wrote:
>
>> Hello,
>>
>> I'm making a simple device that recognizes numbers in pictures taken by 
>> a webcam. Everything runs on a Raspberry Pi 3.
>> The whole thing is essentially the following simple loop (in Python for 
>> simplicity, but it is the same with the C++ API); the images are 
>> preprocessed to black and white:
>>
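>> import tesserocr
>> from tesserocr import PyTessBaseAPI
>>
>> # images: iterable of preprocessed (black and white) PIL images
>> api = PyTessBaseAPI(psm=tesserocr.PSM.SINGLE_WORD)
>>
>> for im in images:
>>     api.SetImage(im)
>>     api.SetSourceResolution(70)  # resolution hint (ppi) for this image
>>     ot = api.GetUTF8Text()       # the slow call
>>
>> api.End()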
>>
>>
>> My problem is that the api.GetUTF8Text() call is quite slow and, moreover, 
>> it gets slower and slower over time. Are there any options to make 
>> recognition faster? I have tried resizing the image to around 50x10px. The 
>> times start at around 300ms but then go up to above 1s, which is too slow 
>> for me. I tried both the legacy and LSTM algorithms, but they are similar.
>>
>> best regards
>> Jan
>>
>
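
The throttling check suggested in the quoted reply can be scripted. The sketch 
below assumes the vcgencmd tool that ships with Raspberry Pi OS; it is not 
mentioned in this thread, and decode_throttled and read_throttled are 
hypothetical helper names. The bit meanings follow the Raspberry Pi 
documentation for "vcgencmd get_throttled".

```python
import subprocess

# Bit positions reported by `vcgencmd get_throttled` (assumed from the
# Raspberry Pi documentation; not taken from this thread).
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output):
    """Decode a line such as 'throttled=0x50005' into a list of flag names."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [name for bit, name in sorted(THROTTLE_FLAGS.items())
            if value & (1 << bit)]


def read_throttled():
    """Run vcgencmd (Raspberry Pi only) and decode its output."""
    out = subprocess.run(["vcgencmd", "get_throttled"],
                         capture_output=True, text=True, check=True).stdout
    return decode_throttled(out)
```

Calling read_throttled() before and after a slow run shows whether any 
under-voltage or throttling flag was raised while the recognition loop was 
running; an empty list means none was reported.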

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/baa59c86-b002-4607-8dda-16835cd3ea73%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
