See https://github.com/tesseract-ocr/tesseract/wiki/Fonts
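
A rough sketch of one check that may help before retrying, offered as an assumption rather than a confirmed diagnosis: the font name in the log is printed with curly quotes (“lohit-bengali”), and the earlier -–output_dir has an en dash after the first hyphen, so both may have been passed to the tools literally. `check_ascii` below is a hypothetical helper, not part of tesseract; it only reports whether a string contains bytes outside printable ASCII.

```shell
#!/bin/sh
# Hypothetical helper (not part of tesseract): report whether a string
# contains bytes outside printable ASCII, such as curly quotes or an en dash.
check_ascii() {
  # In the C locale the range ' ' to '~' covers printable ASCII only,
  # so any other byte (including UTF-8 multibyte characters) matches.
  if printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'; then
    echo "non-ASCII"
  else
    echo "ok"
  fi
}

check_ascii '--fontlist "Lohit Bengali"'   # straight quotes
check_ascii '--fontlist “Lohit Bengali”'   # curly quotes, as in the log
check_ascii '-–output_dir'                 # hyphen followed by an en dash

# To see which font names can actually be found (assumes text2image is
# installed and the fonts live under /usr/share/fonts):
# text2image --list_available_fonts --fonts_dir /usr/share/fonts
```

If the second and third calls report non-ASCII, retyping those arguments with plain `--` and plain `"` in a terminal or plain-text editor is probably the first thing to try, using a font name exactly as it appears in the list.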

On Sun 22 Jul, 2018, 8:20 PM Jennil Thiyam, <thiyamjen...@gmail.com> wrote:

> you guys help me...now there is no error, but i don't know about the
> fonts, i try to train the bengali in "lohit-bengali" font thinking its
> already in the FONTS folder, but i got
>
> === Starting training for language 'ben'
> [Sun Jul 22 10:48:33 EDT 2018] /usr/bin/text2image
> --fonts_dir=/usr/share/fonts/truetype --font=“lohit-bengali”
> --outputbase=/tmp/font_tmp.z6y7AIvqyI/sample_text.txt
> --text=/tmp/font_tmp.z6y7AIvqyI/sample_text.txt
> --fontconfig_tmpdir=/tmp/font_tmp.z6y7AIvqyI
> Could not find font named “lohit-bengali”.
> Pango suggested font FreeMono.
> Please correct --font arg.
>
> === Phase I: Generating training images ===
> Rendering using “lohit-bengali”
> [Sun Jul 22 10:48:34 EDT 2018] /usr/bin/text2image
> --fontconfig_tmpdir=/tmp/font_tmp.z6y7AIvqyI
> --fonts_dir=/usr/share/fonts/truetype --strip_unrenderable_words
> --leading=32 --char_spacing=0.0 --exposure=0
> --outputbase=/tmp/tmp.pBWa4wRHmt/ben/ben.“lohit-bengali”.exp0 --max_pages=3
> --font=“lohit-bengali”
> --text=/home/jennil/Desktop/pro/langdata-master/ben/ben.training_text
> Could not find font named “lohit-bengali”.
> Pango suggested font FreeMono.
> Please correct --font arg.
> ERROR: /tmp/tmp.pBWa4wRHmt/ben/ben.“lohit-bengali”.exp0.box does not exist
> or is not readable
> ERROR: /tmp/tmp.pBWa4wRHmt/ben/ben.“lohit-bengali”.exp0.box does not exist
> or is not readable
>
> SO , please tell is all the fonts which are in this FONTS folder are
> already installed to tesseract or not?
>
> On Sun, Jul 22, 2018 at 7:15 AM, Jennil Thiyam <thiyamjen...@gmail.com>
> wrote:
>
>> Oh sorry for the mistake...I put two dashes, still it says unrecognised..
>>
>> On Sun 22 Jul, 2018, 4:27 PM Shree Devi Kumar, <shreesh...@gmail.com>
>> wrote:
>>
>>> needs two dashes,
>>>
>>> On Sun, Jul 22, 2018 at 12:29 PM <thiyamjen...@gmail.com> wrote:
>>>
>>>> hello again, i modified the error in the way you said and there is no
>>>> error. but now the same error of unrecognised is occured in output_dir.
>>>> the error is
>>>> ERROR: Unrecognized argument -–output_dir
>>>>
>>>> my command is
>>>>
>>>> /usr/share/tesseract-ocr/./tesstrain.sh \
>>>> --fonts_dir /usr/share/fonts \
>>>> --lang ben \
>>>> --linedata_only \
>>>> --noextract_font_properties \
>>>> --langdata_dir /home/jennil/Desktop/pro/langdata-master/ben \
>>>> --tessdata_dir /usr/share/tesseract-ocr/4.00/tessdata \
>>>> -–output_dir /home/jennil/Desktop/pro/output/ben_output \
>>>> --fontlist “Lohit Bengali”
>>>>
>>>> please do help
>>>>
>>>> On Saturday, July 21, 2018 at 1:42:41 PM UTC-4, shree wrote:
>>>>>
>>>>> --linedata_only\
>>>>>
>>>>> You need space before the continuation mark \
>>>>>
>>>>> On Sat 21 Jul, 2018, 10:00 PM , <thiyam...@gmail.com> wrote:
>>>>>
>>>>>> can u please point out the place where to put the space
>>>>>>
>>>>>> thank you
>>>>>>
>>>>>> On Saturday, July 21, 2018 at 12:12:22 PM UTC-4, thiyam...@gmail.com
>>>>>> wrote:
>>>>>>>
>>>>>>> My command is
>>>>>>>
>>>>>>> usr/share/tesseract-ocr/./tesstrain.sh \
>>>>>>> --fonts_dir /usr/share/fonts \
>>>>>>> --lang ben \
>>>>>>> --linedata_only\
>>>>>>> --noextract_font_properties \
>>>>>>> --langdata_dir /home/jennil/Desktop/pro/langdata-master/ben\
>>>>>>> --tessdata_dir /usr/share/tesseract-ocr/4.00/tessdata –output_dir
>>>>>>> /home/jennil/Desktop/pro/output/ben_output\
>>>>>>> --fontlist “Lohit Bengali”
>>>>>>>
>>>>>>> and here is the error
>>>>>>>
>>>>>>> *ERROR: Unrecognized argument
>>>>>>> --linedata_only--noextract_font_properties*
>>>>>>>
>>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "tesseract-ocr" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to tesseract-oc...@googlegroups.com.
>>>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/37073e8b-f628-438c-b1b9-648e90c405b8%40googlegroups.com
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>> --
>>> ____________________________________________________________
>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com