You should uninstall (purge) v3 first, then build v4 from scratch.

On Tue, Oct 16, 2018 at 12:23 PM Vinod Gattani <vinodgattani1...@gmail.com> wrote:
> Robert/Zdenko
>
> Yes, in the log I see version "3.4v".
>
> To install v4, I used the link "https://github.com/tesseract-ocr/tesseract".
> I thought it has tesseract v4, as the Readme file says "Source code for the
> new LSTM based 4.0 version is available from the master branch on GitHub."
> So, I did a git pull.
>
> Steps:
>
>
>    1. git pull https://github.com/tesseract-ocr/tesseract
>    2. cd tesseract
>    3. sudo apt-get install libicu-dev
>    4. sudo apt-get install libpango1.0-dev
>    5. sudo apt-get install libcairo2-dev
>    6. sh autogen.sh
>    7. sh ./configure
>    8. make
>    9. make training
>    10. sudo make training-install
>    11. Training Command gives the error as mentioned.
>
> Also, when I do tesseract -v, I see 3.04.01 too.
>
> So, is there any other way of installing v4.0? Please let me know what I
> am doing wrong.
>
> Regards,
> Vinod
>
> On Tue, Oct 16, 2018 at 12:15 PM Zdenko Podobny <zde...@gmail.com> wrote:
>
>> Robert is pointing you in the right direction. Did you read the log you
>> posted here?
>> " Tesseract Open Source OCR Engine v3.04.01 with Leptonica"
>> You are mixing tesseract versions, so problems are no surprise.
>>
>> Zdenko
>>
>>
>> On Tue, Oct 16, 2018 at 8:26 Vinod Gattani <vinodgattani1...@gmail.com>
>> wrote:
>>
>>> Hi,
>>> Typo: " Why the version is not 4.0.?
>>> I installed using "git pull https://github.com/tesseract-ocr/tesseract".
>>> And then followed the instructions on training page.
>>>
>>> Regards
>>>
>>> On Tue, Oct 16, 2018 at 11:53 AM Robert Kamiński <
>>> kaminski.robert...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> " Why the version is 4.0." What do you mean by that? In logs it states
>>>> that it's 3.04v. "Tesseract Open Source OCR Engine v3.04.01 with
>>>> Leptonica".
>>>> The problem might be the fact that 4th version is using lstm files
>>>> whereas you have version 3.04 using box files instead. Try to check the
>>>> version of installed Tesseract.
>>>> Also note that I'm not the expert here ^.^
>>>>
>>>>
>>>> On Tue, Oct 16, 2018 at 08:04 Vinod Gattani <vinodgattani1...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I have started a project to do OCR on Identity Cards. I am learning to
>>>>> train tesseract models with custom fonts.
>>>>>
>>>>> Please help me on this.
>>>>>
>>>>> Steps till now:
>>>>>
>>>>> 1. git pull https://github.com/tesseract-ocr/tesseract
>>>>> 2. Then I followed instructions on training package till command "sudo
>>>>> make training-install".
>>>>> 3. Downloaded eng.traineddata from
>>>>> https://github.com/tesseract-ocr/tessdata_best in tessdata folder
>>>>> 4. Command " src/training/tesstrain.sh --fonts_dir /usr/share/fonts
>>>>> --fontlist "Arial Bold" --lang eng --linedata_only
>>>>> --noextract_font_properties --langdata_dir ../langdata --tessdata_dir
>>>>> ./tessdata --output_dir ~/tesstutorial/engtrain"
>>>>>
>>>>> It is giving error:
>>>>> === Phase E: Generating lstmf files ===
>>>>> Using TESSDATA_PREFIX=./tessdata
>>>>> [Tue Oct 16 05:41:31 UTC 2018] /usr/bin/tesseract
>>>>> /tmp/tmp.4EGdp9wW57/eng.Arial_Bold.exp0.tif
>>>>> /tmp/tmp.4EGdp9wW57/eng.Arial_Bold.exp0 --psm 6 lstm.train
>>>>> Tesseract Open Source OCR Engine v3.04.01 with Leptonica
>>>>> fseek(data_file_, static_cast<size_t>(offset_table_[tessdata_type]),
>>>>> SEEK_SET) == 0:Error:Assert failed:in file ../ccutil/tessdatamanager.h,
>>>>> line 173
>>>>> ERROR: /tmp/tmp.4EGdp9wW57/eng.Arial_Bold.exp0.lstmf does not exist or
>>>>> is not readable
>>>>>
>>>>> Why the version is 4.0.
>>>>>
>>>>> Also, how do we download custom font for my Identity Cards.
>>>>>
>>>>> Regards,
>>>>>
>>>>> On Monday, 10 September 2018 15:05:15 UTC+5:30, kaminski....@gmail.com
>>>>> wrote:
>>>>>>
>>>>>> Thank you Shreeshrii for reply!
>>>>>>
>>>>>> Manual customization of these files might be kinda annoying.
>>>>>> If I need to experiment with the dawg files and achieve something,
>>>>>> I'll surely let you know if there is any difference. Again, thank you
>>>>>> for your help and time :)
>>>>>>
>>>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to tesseract-ocr+unsubscr...@googlegroups.com.
>>>>> To post to this group, send email to tesseract-ocr@googlegroups.com.
>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/279bc21a-199a-43be-b5d6-07bfdd2a833f%40googlegroups.com
>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/279bc21a-199a-43be-b5d6-07bfdd2a833f%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>

--
Regards,
Soumik Ranjan Dasgupta
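
P.S. Since the whole thread turns on which Tesseract version actually runs, here is a minimal Python sketch for checking it before starting tesstrain.sh. It only parses the version banner; the helper names (parse_version, supports_lstm_training, installed_banner) are made up for illustration and are not part of Tesseract. The "major version 4 or later" rule is an assumption based on Robert's note that the LSTM files belong to the 4th version.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

# Banner line copied from the error log quoted in this thread.
LOG_BANNER = "Tesseract Open Source OCR Engine v3.04.01 with Leptonica"


def parse_version(banner: str) -> Optional[Tuple[int, int, int]]:
    """Return (major, minor, patch) from the first x.y[.z] number in the text."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", banner)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def supports_lstm_training(banner: str) -> bool:
    """Assumption: LSTM training needs a banner reporting major version >= 4."""
    version = parse_version(banner)
    return version is not None and version[0] >= 4


def installed_banner() -> Optional[Tuple[str, str]]:
    """Run the first `tesseract` on PATH with -v.

    Returns (path, banner text), or None if no binary is found.
    """
    path = shutil.which("tesseract")
    if path is None:
        return None
    result = subprocess.run([path, "-v"], capture_output=True, text=True)
    # Read both streams, in case a build prints the version to stderr.
    return path, (result.stdout + result.stderr).strip()
```

Calling installed_banner() and passing the banner text to supports_lstm_training() shows both the path and the version of the binary that the shell picks up. In the log above that binary is /usr/bin/tesseract, reporting v3.04.01, which is why purging v3 comes first.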