It is 4.0. I'm satisfied with the recognition results, but I need to make it faster (consistently below 1 s per call)...
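To put numbers on the slowdown, a minimal sketch of a per-call timer might help. It is not tied to tesseract: `time_calls`, `drift_ratio`, `recognize` and `window` are hypothetical names introduced here for illustration, and `recognize` stands for whatever wraps the `api.SetImage(im)` and `api.GetUTF8Text()` calls from the loop quoted below.

```python
import time
from statistics import median


def time_calls(recognize, images):
    """Return the wall-clock latency in seconds of recognize(im) for each image."""
    timings = []
    for im in images:
        start = time.perf_counter()
        recognize(im)
        timings.append(time.perf_counter() - start)
    return timings


def drift_ratio(timings, window=10):
    """Median of the last `window` timings divided by the median of the first `window`.

    A value well above 1.0 suggests that calls are getting slower over time.
    """
    if len(timings) < 2 * window:
        raise ValueError("need at least 2 * window timings")
    return median(timings[-window:]) / median(timings[:window])
```

Logging the Pi's CPU temperature and clock next to these timings would show whether the rise coincides with throttling, as suggested earlier in the thread.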

On Wednesday, 30 January 2019 at 7:48:23 UTC+1, zdenop wrote:
>
> What is your tesseract version?
>
> Zdenko
>
> On Tue, 29 Jan 2019 at 20:40, Jan Pohanka <[email protected]> wrote:
>
>> Thanks for the suggestions. You are right that I'm referring to the
>> api.GetUTF8Text() call; it is my bottleneck.
>> I was not aware that there are fast and best models in tesseract 4.0,
>> I will give them a try. So far I used just lang=eng or osd.
>> What I find suspicious is that the calls get longer over time. To be
>> more precise, the first 10-15 calls take up to 500ms and the later
>> ones rise above 1s...
>> Moving SetSourceResolution outside of the loop makes no difference,
>> unfortunately.
>>
>> BR
>> Jan
>>
>> On Tuesday, 29 January 2019 at 18:08:49 UTC+1, Lorenzo Blz wrote:
>>>
>>> First double check whether the Pi is throttling due to overheating or
>>> lack of USB power. This may cause the slowdown.
>>>
>>> Usually 30/50 px of text height is fine. IF the problem is tesseract,
>>> try to use the fast model (or "normal" if using best). I assume you are
>>> using the 4.x release.
>>>
>>> Try tesseract -v to see if you are using all the available CPU
>>> optimizations.
>>>
>>> Try to move the SetSourceResolution outside the loop and see if it
>>> changes something (MAYBE it invalidates some caches or something).
>>>
>>> The time you are referring to is one single api.GetUTF8Text() call,
>>> correct?
>>>
>>> Lorenzo
>>>
>>> On Tue, 29 Jan 2019 at 17:48, Jan Pohanka <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm making a simple device used to recognize numbers on pictures taken
>>>> by a webcam. Everything runs on a Raspberry Pi 3.
>>>> It all comes down to the following simple loop (in Python for
>>>> simplicity, but it is the same using the C++ API); images are
>>>> preprocessed to black and white.
>>>>
>>>> api = PyTessBaseAPI(psm=tesserocr.PSM.SINGLE_WORD)
>>>>
>>>> for im in images:
>>>>     api.SetImage(im)
>>>>     api.SetSourceResolution(70)
>>>>     ot = api.GetUTF8Text()
>>>>
>>>> api.End()
>>>>
>>>> My problem is that the api.GetUTF8Text() call is quite slow and,
>>>> moreover, it gets slower and slower over time. Are there any options
>>>> to make recognition faster? I have tried resizing the image to around
>>>> 50x10px. The times start at around 300ms but then go up to above 1s,
>>>> which is too slow for me. I tried both the legacy and LSTM algorithms,
>>>> but they are similar.
>>>>
>>>> best regards
>>>> Jan
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/a53b4b25-97e3-47dc-823a-cbb219225eed%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>> --
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/baa59c86-b002-4607-8dda-16835cd3ea73%40googlegroups.com.
>>

--
To view this discussion on the web visit
https://groups.google.com/d/msgid/tesseract-ocr/dedb7fd8-d61e-42bb-a492-34beaa8b1514%40googlegroups.com.

