You need to use a Unicode font, and FMAbhaya does not appear to be one: http://www.sinhalafonts.org/fonts/13142/fm_abhaya.html
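A rough way to check whether a font is usable as a Unicode Sinhala font is to see how much of the Sinhala block (U+0D80 to U+0DFF) its character map covers. The sketch below is only an illustration, not part of tesstrain.sh. The function names are made up for this example, and `font_codepoints` assumes the third-party fontTools package is installed. The working assumption is that a non-Unicode font would report little or no coverage of the block.

```python
import unicodedata

# Sinhala Unicode block: U+0D80 .. U+0DFF
SINHALA_BLOCK = range(0x0D80, 0x0E00)


def assigned_sinhala_codepoints():
    """Code points in the Sinhala block that have a name in this Python's Unicode data."""
    return {cp for cp in SINHALA_BLOCK if unicodedata.name(chr(cp), "")}


def sinhala_coverage(font_codepoints):
    """Fraction (0.0 to 1.0) of assigned Sinhala code points present in font_codepoints."""
    wanted = assigned_sinhala_codepoints()
    return len(wanted & set(font_codepoints)) / len(wanted)


def font_codepoints(font_path):
    """Code points mapped by a font file (hypothetical helper; requires fontTools)."""
    from fontTools.ttLib import TTFont  # third-party, imported lazily

    cmap = TTFont(font_path).getBestCmap() or {}  # None if no Unicode cmap exists
    return set(cmap.keys())
```

For example, `sinhala_coverage(font_codepoints("fonts/SomeFont.ttf"))` (the path is a placeholder) returning a value close to 0 would suggest the font cannot render Unicode Sinhala training text.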
https://github.com/tesseract-ocr/langdata_lstm/blob/master/sin/okfonts.txt lists the fonts used for Tesseract 4 alpha.

On Mon, Sep 23, 2019 at 3:07 PM isuri anuradha <[email protected]> wrote:

> As the initial step, I used this command to generate the training data [1]:
>
> rm -rf train/*
> tesstrain.sh --fonts_dir fonts \
>   --fontlist 'FMAbhaya' \
>   --lang sin \
>   --linedata_only \
>   --langdata_dir langdata_lstm \
>   --tessdata_dir tesseract/tessdata \
>   --save_box_tiff \
>   --maxpages 10 \
>   --output_dir train
>
> I used the FMAbhaya font for the training, but it produces an error like [2].
>
> [image: abc1.png]
>
> Why does this kind of error appear, and what are the solutions to fix this
> issue?

