Did you check this? https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=147781&start=50#p972790
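
On the throttling check suggested in the quoted thread below: on a Raspberry Pi the firmware's throttle flags can be read with `vcgencmd get_throttled`, which prints a line such as `throttled=0x50005`. Here is a minimal sketch for decoding that value. It assumes `vcgencmd` is on the PATH; the helper names `decode_throttled` and `read_throttled` are made up for illustration; and the bit meanings are as I understand them from the Raspberry Pi documentation (the soft temperature limit bits may be missing on older firmware).

```python
import subprocess

# Bit positions in the value reported by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output):
    """Turn a 'throttled=0x...' line into a list of flag descriptions."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [text for bit, text in sorted(THROTTLE_FLAGS.items())
            if value & (1 << bit)]


def read_throttled():
    """Run vcgencmd on the Pi and decode its output (needs Python 3.7+)."""
    result = subprocess.run(["vcgencmd", "get_throttled"],
                            capture_output=True, text=True, check=True)
    return decode_throttled(result.stdout)
```

Calling `read_throttled()` before and after the OCR loop would show whether under-voltage or thermal throttling kicked in while the times were rising. An empty list means no flag is set.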
On Wed, 30 Jan 2019 at 08:09, Jan Pohanka <[email protected]> wrote:

> I have already done that but haven't found anything interesting.
> I asked here to find out whether, e.g., any parts of the algorithm can be
> disabled. The image is preprocessed and binarized, and contains only 8
> digits (and a point). I was also a bit surprised that resizing the image
> from 400px to 50px gave only a subtle speed-up.
>
> I will try the fast model today (if I find out how to switch to it),
> maybe it will help.
>
> Here are my measured times:
> ocr time: 0.980876922607
> ocr time: 0.435426950455
> ocr time: 0.76907491684
> ocr time: 0.836761951447
> ocr time: 0.871710062027
> ocr time: 0.803520917892
> ocr time: 0.371052026749
> ocr time: 0.732284069061
> ocr time: 0.745162010193
> ocr time: 0.836426019669
> ocr time: 0.740739107132
> ocr time: 0.379159927368
> ocr time: 0.798940181732
> ocr time: 0.3972260952
> ocr time: 0.739762067795
> ocr time: 0.7757999897
> ocr time: 0.772871017456
> ocr time: 0.435608863831
> ocr time: 0.770547866821
> ocr time: 0.870738983154
> ocr time: 0.37126493454
> ocr time: 0.837875127792
> ocr time: 0.811723947525
> ocr time: 0.865257024765
> ocr time: 0.79048204422
> ocr time: 0.435704946518
> ocr time: 0.763910055161
> ocr time: 0.391008853912
> ocr time: 0.396636009216
> ocr time: 0.38174700737
> ocr time: 0.809095144272
> ocr time: 0.773195028305
> ocr time: 0.427488088608
> ocr time: 0.403608083725
> ocr time: 0.806233167648
> ocr time: 0.948635101318
> ocr time: 0.900885105133
> ocr time: 0.829130887985
> ocr time: 0.932774782181
> ocr time: 1.09788799286
> ocr time: 0.520708799362
> ocr time: 0.448786973953
> ocr time: 0.560626983643
> ocr time: 0.993177175522
> ocr time: 0.48442697525
> ocr time: 1.1292309761
> ocr time: 1.04695606232
> ocr time: 0.8810338974
> ocr time: 1.10285806656
> ocr time: 1.05213713646
> ocr time: 1.22593903542
> ocr time: 1.04618191719
> ocr time: 1.11645102501
> ocr time: 1.05435395241
> ocr time: 1.15162396431
> ocr time: 0.547721862793
> ocr time: 0.607867956161
> ocr time: 1.14074802399
> ocr time: 1.1790971756
> ocr time: 1.18815803528
> ocr time: 0.58503985405
> ocr time: 1.10898280144
> ocr time: 1.22723913193
> ocr time: 1.2178709507
> ocr time: 1.28540086746
> ocr time: 1.28237104416
> ocr time: 1.56176805496
> ocr time: 1.2859480381
> ocr time: 1.2599170208
> ocr time: 1.42588591576
> ocr time: 1.51333785057
> ocr time: 1.34276986122
> ocr time: 1.34283900261
> ocr time: 1.39351201057
> ocr time: 1.61450195312
> ocr time: 1.44723105431
> ocr time: 1.63176107407
> ocr time: 0.82429599762
> ocr time: 1.08239603043
> ocr time: 0.755813121796
> ocr time: 1.63984704018
> ocr time: 1.84553313255
> ocr time: 0.958009958267
> ocr time: 1.52479290962
> ocr time: 0.919597864151
>
> thanks
> Jan
>
> On Wednesday, 30 January 2019 at 7:57:07 UTC+1, zdenop wrote:
>>
>> search issue tracker for "speed"...
>>
>> Zdenko
>>
>>
>> On Wed, 30 Jan 2019 at 7:51, Jan Pohanka <[email protected]> wrote:
>>
>>> It is 4.0. I'm satisfied with the recognition results, but I need to
>>> make it faster (with consistent times below 1s)...
>>>
>>> On Wednesday, 30 January 2019 at 7:48:23 UTC+1, zdenop wrote:
>>>>
>>>> What is your tesseract version?
>>>>
>>>> Zdenko
>>>>
>>>>
>>>> On Tue, 29 Jan 2019 at 20:40, Jan Pohanka <[email protected]> wrote:
>>>>
>>>>> Thanks for the suggestions. You are right that I'm referring to the
>>>>> api.GetUTF8Text() call; it is my bottleneck.
>>>>> I was not aware that there are fast and best models in tesseract
>>>>> 4.0, I will give them a try. So far I have used just lang=eng or osd.
>>>>> What I find suspicious is that the calls get longer over time.
>>>>> To be more precise, the first 10-15 calls take up to 500ms and the
>>>>> later ones rise above 1s...
>>>>> Moving SetSourceResolution outside of the loop makes no difference,
>>>>> unfortunately.
>>>>>
>>>>> BR
>>>>> Jan
>>>>>
>>>>> On Tuesday, 29 January 2019 at 18:08:49 UTC+1, Lorenzo Blz wrote:
>>>>>>
>>>>>>
>>>>>> First, double-check that the Pi is not throttling due to overheating
>>>>>> or lack of USB power. This may cause the slowdown.
>>>>>>
>>>>>> Usually 30/50 px of text height is fine. IF the problem is tesseract,
>>>>>> try to use the fast model (or "normal" if using best). I assume you
>>>>>> are using the 4.x release.
>>>>>>
>>>>>> Try tesseract -v to see if you are using all the available CPU
>>>>>> optimizations.
>>>>>>
>>>>>> Try to move the SetSourceResolution outside the loop and see if it
>>>>>> changes something (MAYBE it invalidates some caches or something).
>>>>>>
>>>>>> The time you are referring to is one single api.GetUTF8Text() call,
>>>>>> correct?
>>>>>>
>>>>>>
>>>>>> Lorenzo
>>>>>>
>>>>>>
>>>>>> On Tue, 29 Jan 2019 at 17:48, Jan Pohanka <[email protected]> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I'm making a simple device that recognizes numbers in pictures
>>>>>>> taken by a webcam. Everything runs on a Raspberry Pi 3.
>>>>>>> It all boils down to the following simple loop (in Python for
>>>>>>> simplicity, but it is the same with the C++ API); the images are
>>>>>>> preprocessed to black and white.
>>>>>>>
>>>>>>> import tesserocr
>>>>>>> from tesserocr import PyTessBaseAPI
>>>>>>>
>>>>>>> api = PyTessBaseAPI(psm=tesserocr.PSM.SINGLE_WORD)
>>>>>>>
>>>>>>> # images: the preprocessed black-and-white frames
>>>>>>> for im in images:
>>>>>>>     api.SetImage(im)
>>>>>>>     api.SetSourceResolution(70)
>>>>>>>     ot = api.GetUTF8Text()
>>>>>>>
>>>>>>> api.End()
>>>>>>>
>>>>>>>
>>>>>>> My problem is that the api.GetUTF8Text() call is quite slow and,
>>>>>>> moreover, it gets slower and slower over time. Are there any options
>>>>>>> to make recognition faster? I have tried resizing the image to
>>>>>>> around 50x10px. The times start at around 300ms but then go up to
>>>>>>> above 1s, which is too slow for me. I tried both the legacy and LSTM
>>>>>>> algorithms, but they are similar.
>>>>>>>
>>>>>>> best regards
>>>>>>> Jan
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "tesseract-ocr" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> To post to this group, send email to [email protected].
>>>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/a53b4b25-97e3-47dc-823a-cbb219225eed%40googlegroups.com
>>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/a53b4b25-97e3-47dc-823a-cbb219225eed%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>> .
>>>>>>> For more options, visit https://groups.google.com/d/optout.
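
To compare models and to put a number on the "getting slower over time" effect described in the thread, the per-call timing can be wrapped in a small harness. This is only a sketch under stated assumptions: `time_calls`, `drift` and `make_recognizer` are hypothetical helper names, `tessdata_dir` is a placeholder for wherever the model files (for example a checkout of tessdata_fast) live on your system, and the tesserocr part assumes the `path`, `lang` and `psm` keyword arguments of `PyTessBaseAPI`.

```python
import time
from statistics import mean


def time_calls(recognize, images):
    """Call recognize(im) for each image; return (results, seconds per call)."""
    results, durations = [], []
    for im in images:
        start = time.perf_counter()
        results.append(recognize(im))
        durations.append(time.perf_counter() - start)
    return results, durations


def drift(durations, window=10):
    """Mean of the last `window` timings divided by the mean of the first."""
    if len(durations) < 2 * window:
        raise ValueError("need at least 2 * window measurements")
    return mean(durations[-window:]) / mean(durations[:window])


def make_recognizer(tessdata_dir):
    """Build a recognize(im) function on one reused tesserocr API object.

    tessdata_dir is a placeholder path to the directory with the
    traineddata files you want to test.
    """
    import tesserocr  # third-party, imported lazily so the helpers above work without it

    api = tesserocr.PyTessBaseAPI(path=tessdata_dir, lang="eng",
                                  psm=tesserocr.PSM.SINGLE_WORD)

    def recognize(im):
        api.SetImage(im)
        return api.GetUTF8Text()

    return recognize
```

A `drift` value well above 1 on the same set of images would confirm that later calls are slower than early ones. Running the harness once per model directory, and once right after a cool-down, should help separate a Tesseract effect from a hardware one.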

