See https://github.com/tesseract-ocr/tesseract/wiki/Fonts
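
A rough sketch of how one could check the font name before training, in case it helps. The quoted log shows the font being passed as “lohit-bengali” with typographic quotes, and those quotes appear to have reached text2image as part of the name. The helper clean_font_name is hypothetical (it is not part of tesseract), and /usr/share/fonts and "Lohit Bengali" are simply taken from the command quoted further down; the name your own installation reports is the one to trust.

```shell
# Hypothetical helper: strip typographic (curly) double quotes from a name.
clean_font_name() {
  printf '%s\n' "$1" | sed -e 's/“//g' -e 's/”//g'
}

# Font name as it appears in the quoted command, curly quotes included.
font=$(clean_font_name '“Lohit Bengali”')
echo "looking for: $font"

# text2image can list the font names it is able to see in a directory.
# The guard skips the lookup where the training tools are not installed.
if command -v text2image >/dev/null 2>&1; then
  text2image --list_available_fonts --fonts_dir=/usr/share/fonts 2>&1 \
    | grep -i "$font" || echo "no font matching '$font' under /usr/share/fonts"
fi
```

Whatever name the listing prints is the one to pass to --fontlist, typed with plain ASCII quotes.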

On Sun 22 Jul, 2018, 8:20 PM Jennil Thiyam, <thiyamjen...@gmail.com> wrote:

> You guys helped me; now there is no error. But I don't know about the
> fonts: I tried to train Bengali with the "lohit-bengali" font, thinking it
> was already in the FONTS folder, but I got
>
> === Starting training for language 'ben'
> [Sun Jul 22 10:48:33 EDT 2018] /usr/bin/text2image
> --fonts_dir=/usr/share/fonts/truetype --font=“lohit-bengali”
> --outputbase=/tmp/font_tmp.z6y7AIvqyI/sample_text.txt
> --text=/tmp/font_tmp.z6y7AIvqyI/sample_text.txt
> --fontconfig_tmpdir=/tmp/font_tmp.z6y7AIvqyI
> Could not find font named “lohit-bengali”.
> Pango suggested font FreeMono.
> Please correct --font arg.
>
> === Phase I: Generating training images ===
> Rendering using “lohit-bengali”
> [Sun Jul 22 10:48:34 EDT 2018] /usr/bin/text2image
> --fontconfig_tmpdir=/tmp/font_tmp.z6y7AIvqyI
> --fonts_dir=/usr/share/fonts/truetype --strip_unrenderable_words
> --leading=32 --char_spacing=0.0 --exposure=0
> --outputbase=/tmp/tmp.pBWa4wRHmt/ben/ben.“lohit-bengali”.exp0 --max_pages=3
> --font=“lohit-bengali”
> --text=/home/jennil/Desktop/pro/langdata-master/ben/ben.training_text
> Could not find font named “lohit-bengali”.
> Pango suggested font FreeMono.
> Please correct --font arg.
> ERROR: /tmp/tmp.pBWa4wRHmt/ben/ben.“lohit-bengali”.exp0.box does not exist
> or is not readable
> ERROR: /tmp/tmp.pBWa4wRHmt/ben/ben.“lohit-bengali”.exp0.box does not exist
> or is not readable
>
> So, please tell me: are all the fonts in this FONTS folder already
> installed for tesseract or not?
>
>
> On Sun, Jul 22, 2018 at 7:15 AM, Jennil Thiyam <thiyamjen...@gmail.com>
> wrote:
>
>> Oh, sorry for the mistake. I put two dashes, but it still says unrecognised.
>>
>> On Sun 22 Jul, 2018, 4:27 PM Shree Devi Kumar, <shreesh...@gmail.com>
>> wrote:
>>
>>> --output_dir needs two dashes.
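
As a small illustration, assuming the failing argument was pasted from a word processor or web page: the -–output_dir in the quoted command is a hyphen followed by an en dash, not two hyphens. The function name has_typographic_chars is made up for this sketch; it only flags arguments that contain typographic dashes or quotes so they can be retyped by hand.

```shell
# Hypothetical check: succeeds if the argument contains an en dash, an em dash,
# or a curly quote, which shells and option parsers do not treat as - ' or ".
has_typographic_chars() {
  case $1 in
    *–*|*—*|*“*|*”*|*‘*|*’*) return 0 ;;
    *) return 1 ;;
  esac
}

# Arguments copied from the quoted command, including the faulty one.
for arg in --lang ben '-–output_dir' /home/jennil/Desktop/pro/output/ben_output; do
  if has_typographic_chars "$arg"; then
    echo "retype this argument with plain ASCII characters: $arg"
  fi
done
```

Only the -–output_dir argument is reported; the others contain plain ASCII only.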
>>>
>>> On Sun, Jul 22, 2018 at 12:29 PM <thiyamjen...@gmail.com> wrote:
>>>
>>>> Hello again. I fixed the command the way you said and that error is
>>>> gone, but now the same "unrecognized argument" error occurs for
>>>> output_dir. The error is
>>>> ERROR: Unrecognized argument -–output_dir
>>>>
>>>> my command is
>>>>
>>>> /usr/share/tesseract-ocr/./tesstrain.sh \
>>>> --fonts_dir /usr/share/fonts \
>>>> --lang ben \
>>>> --linedata_only \
>>>> --noextract_font_properties \
>>>> --langdata_dir /home/jennil/Desktop/pro/langdata-master/ben \
>>>> --tessdata_dir /usr/share/tesseract-ocr/4.00/tessdata \
>>>> -–output_dir /home/jennil/Desktop/pro/output/ben_output \
>>>> --fontlist “Lohit Bengali”
>>>>
>>>>
>>>> Please help.
>>>>
>>>> On Saturday, July 21, 2018 at 1:42:41 PM UTC-4, shree wrote:
>>>>>
>>>>> --linedata_only\
>>>>>
>>>>> You need a space before the continuation mark \
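
To see why the space matters: the shell removes a backslash followed by a newline before it splits the line into words, so with no space the start of the next line is glued onto the previous word. A minimal sketch, where count_args is a hypothetical helper that just reports how many arguments it received (the continuation lines start at column 0 on purpose, as in the quoted command):

```shell
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

# No space before the backslash: both flags are joined into a single word,
# --linedata_only--noextract_font_properties, as in the quoted error.
joined=$(count_args --linedata_only\
--noextract_font_properties)

# Space before the backslash: the flags stay two separate words.
separate=$(count_args --linedata_only \
--noextract_font_properties)

echo "without space: $joined argument(s); with space: $separate argument(s)"
```

This prints 1 for the first form and 2 for the second.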
>>>>>
>>>>> On Sat 21 Jul, 2018, 10:00 PM , <thiyam...@gmail.com> wrote:
>>>>>
>>>>>> Can you please point out where to put the space?
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>> On Saturday, July 21, 2018 at 12:12:22 PM UTC-4, thiyam...@gmail.com
>>>>>> wrote:
>>>>>>>
>>>>>>> My command is
>>>>>>>
>>>>>>>
>>>>>>> usr/share/tesseract-ocr/./tesstrain.sh \
>>>>>>> --fonts_dir /usr/share/fonts \
>>>>>>> --lang ben \
>>>>>>> --linedata_only\
>>>>>>> --noextract_font_properties \
>>>>>>> --langdata_dir /home/jennil/Desktop/pro/langdata-master/ben\
>>>>>>> --tessdata_dir /usr/share/tesseract-ocr/4.00/tessdata –output_dir
>>>>>>> /home/jennil/Desktop/pro/output/ben_output\
>>>>>>> --fontlist “Lohit Bengali”
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> and here is the error
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *ERROR: Unrecognized argument
>>>>>>> --linedata_only--noextract_font_properties*
>>>>>>>
>>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "tesseract-ocr" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to tesseract-oc...@googlegroups.com.
>>>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/37073e8b-f628-438c-b1b9-648e90c405b8%40googlegroups.com
>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/37073e8b-f628-438c-b1b9-648e90c405b8%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>
>>>
>>> --
>>>
>>> ____________________________________________________________
>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>
>>
