Search the issue tracker for "speed"...

Zdenko

On Wed, 30 Jan 2019 at 7:51, Jan Pohanka <[email protected]> wrote:

> It is 4.0. I'm satisfied with the recognition results, but I need to
> make it faster (consistently below 1 s)...
>
> On Wednesday, 30 January 2019 at 7:48:23 UTC+1, zdenop wrote:
>>
>> What is your tesseract version?
>>
>> Zdenko
>>
>>
>> On Tue, 29 Jan 2019 at 20:40, Jan Pohanka <[email protected]> wrote:
>>
>>> Thanks for the suggestions. You are right that I'm referring to the
>>> api.GetUTF8Text() call; it is my bottleneck.
>>> I was not aware that there are fast and best models in tesseract
>>> 4.0; I will give them a try. So far I have used just lang=eng or osd.
>>> What I find suspicious is that the calls take longer over time. To
>>> be more precise, the first 10-15 calls take up to 500 ms and later
>>> ones rise above 1 s...
>>> Calling SetSourceResolution outside of the loop makes no difference,
>>> unfortunately.
>>>
>>> BR
>>> Jan
>>>
>>> On Tuesday, 29 January 2019 at 18:08:49 UTC+1, Lorenzo Blz wrote:
>>>>
>>>> First, double-check that the Pi is not throttling due to
>>>> overheating or lack of USB power. This may cause the slowdown.
>>>>
>>>> Usually 30/50 px of text height is fine. If the problem is
>>>> tesseract, try the fast model (or "normal" if using best). I
>>>> assume you are using the 4.x release.
>>>>
>>>> Try tesseract -v to see if you are using all the available CPU
>>>> optimizations.
>>>>
>>>> Try moving SetSourceResolution outside the loop and see if it
>>>> changes something (maybe it invalidates some caches or something).
>>>>
>>>> The time you are referring to is one single api.GetUTF8Text()
>>>> call, correct?
>>>>
>>>>
>>>> Lorenzo
>>>>
>>>>
>>>> On Tue, 29 Jan 2019 at 17:48, Jan Pohanka <[email protected]> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I'm making a simple device that recognizes numbers in pictures
>>>>> taken by a webcam. Everything runs on a Raspberry Pi 3.
>>>>> It all comes down to the following simple loop (in Python for
>>>>> simplicity, but it is the same with the C++ API); the images are
>>>>> preprocessed to black and white:
>>>>>
>>>>>     import tesserocr
>>>>>     from tesserocr import PyTessBaseAPI
>>>>>
>>>>>     # images: the preprocessed black-and-white frames
>>>>>     api = PyTessBaseAPI(psm=tesserocr.PSM.SINGLE_WORD)
>>>>>
>>>>>     for im in images:
>>>>>         api.SetImage(im)
>>>>>         api.SetSourceResolution(70)
>>>>>         ot = api.GetUTF8Text()
>>>>>
>>>>>     api.End()
>>>>>
>>>>>
>>>>> My problem is that the api.GetUTF8Text() call is quite slow and,
>>>>> moreover, it gets slower and slower over time. Are there any
>>>>> options to make recognition faster? I have tried resizing the
>>>>> image to around 50x10 px. The times start at around 300 ms but
>>>>> then go up to above 1 s, which is too slow for me. I tried both
>>>>> the legacy and LSTM algorithms, but they are similar.
>>>>>
>>>>> best regards
>>>>> Jan
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the
>>>>> Google Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>> send an email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/a53b4b25-97e3-47dc-823a-cbb219225eed%40googlegroups.com.
>>>>> For more options, visit https://groups.google.com/d/optout.
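
To pin down the "first 10-15 calls are fast, later ones are slow" observation from the thread, it may help to log the duration of every call and compare early calls against late ones. Below is a minimal sketch, not anything from tesseract or tesserocr: the helper names time_calls and drift_ratio and the window of 10 calls are assumptions made for illustration, and the recognizer is a stand-in so the example runs without a camera or tesserocr installed.

```python
import statistics
import time
from typing import Callable, Iterable, List


def time_calls(recognize: Callable[[object], str],
               images: Iterable[object]) -> List[float]:
    """Return the wall-clock duration (seconds) of each recognize(image) call."""
    durations = []
    for image in images:
        start = time.perf_counter()
        recognize(image)
        durations.append(time.perf_counter() - start)
    return durations


def drift_ratio(durations: List[float], window: int = 10) -> float:
    """Median of the last `window` calls divided by the median of the first `window`.

    A value well above 1.0 means the calls got slower over the run.
    """
    if len(durations) < 2 * window:
        raise ValueError("need at least 2 * window measurements")
    early = statistics.median(durations[:window])
    late = statistics.median(durations[-window:])
    return late / early


if __name__ == "__main__":
    # Hypothetical stand-in for the real OCR call; replace it with a function
    # that runs api.SetImage(im) followed by api.GetUTF8Text().
    def fake_recognize(image: object) -> str:
        time.sleep(0.001)
        return "42"

    durations = time_calls(fake_recognize, range(30))
    print(f"median: {statistics.median(durations) * 1000:.1f} ms")
    print(f"late/early ratio: {drift_ratio(durations):.2f}")
```

Medians are used rather than means so that a single slow outlier does not dominate the comparison. If the ratio stays high even with the fast model, recording the timings next to the Pi's temperature and clock readings could help separate throttling from a slowdown inside tesseract itself.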

