https://groups.google.com/forum/#!topic/tesseract-ocr/e3lqpY0pMpw
https://groups.google.com/forum/#!topic/tesseract-ocr/UidqCx6OE0Q
https://github.com/OpenGreekAndLatin/greek-dev/wiki/uzn-format
https://github.com/jsoma/tesseract-uzn
...
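For readers who do not want to follow every link, here is a rough Python sketch of the uzn idea those pages describe. It rests on several assumptions that are not confirmed in this thread: that a zone file is plain text with one `left top width height label` entry per line (pixels, origin at the top-left), that tesseract looks for it next to the image under the same base name with a `.uzn` suffix, and that it is only consulted with `--psm 4`. The helper names, the image width, and the per-line pixel height are hypothetical; the "first 30 lines" have to be converted to a pixel band by measuring the actual scan.

```python
from pathlib import Path


def uzn_line(left, top, width, height, label="Text"):
    """Format one zone as 'left top width height label' (assumed uzn layout)."""
    return f"{left} {top} {width} {height} {label}"


def uzn_path_for(image_path):
    """Assumed convention: same directory and base name as the image, .uzn suffix."""
    return Path(image_path).with_suffix(".uzn")


def top_band_zone(image_width, line_height_px, n_lines):
    """One zone covering the top n_lines of the page.

    line_height_px is a hypothetical estimate; measure it on a real page.
    """
    return uzn_line(0, 0, image_width, line_height_px * n_lines)


def write_uzn(image_path, zones):
    """Write the zone lines next to the image and return the file path."""
    path = uzn_path_for(image_path)
    path.write_text("\n".join(zones) + "\n")
    return path


def build_command(image_path, output_base):
    """Command line that should make tesseract pick up the zone file.

    Assumption: zone files are only read in page segmentation mode 4.
    """
    return ["tesseract", str(image_path), str(output_base), "--psm", "4"]
```

To try it, call `write_uzn("scan.tiff", [top_band_zone(2480, 50, 30)])` and then pass `build_command("scan.tiff", "out")` to `subprocess.run`. Restricting recognition to one band should reduce the work per page, but whether it lowers the CPU spike on a given machine would need to be measured.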
PS: I hope it works with tesseract 4 too ;-) I have not tested it yet, but

Zdenko

On Thu, 31 Jan 2019 at 23:34, George Varghese <geoj...@gmail.com> wrote:

> I am using tesseract v4.0.0.20181030, leptonica-1.76.0.
>
> In short: I am using the command line to convert a .tiff file to a .txt
> file; no loop or custom solution is used. Yes, the first 30 lines have the
> same location, and I am specifying to OCR only my first page.
>
> You mentioned the usage of a uzn file. I am not aware of such a config
> -c parameter.
>
> I would appreciate it if you could give me a link to any documentation.
>
> On Wednesday, January 30, 2019 at 11:34:42 AM UTC-8, George Varghese wrote:
>>
>> I am using tesseract v4 to convert a .tiff file to text, only the first
>> page. The script, run from the command line on Windows 2012, takes almost
>> 8 seconds to convert only the first page using the configuration below.
>> The CPU usage also shoots up to 80% during that time.
>>
>> -c tessedit_page_number=1
>>
>> In reality, I only want to convert the first 30 lines to a text file
>> output.
>>
>> Is there any config option to look only at the first 30 lines of the
>> .tiff file, or any other parameter that will decrease the CPU usage? It
>> is OK even if the OCR conversion takes 15 seconds to run, as long as I do
>> not get this CPU spike.
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAJbzG8zv2zUtDzbufCwTyFFPM424%3DX3GA8AOVWWHv%3DHD8UEkbw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.