
Re: [tesseract-ocr] Experiment with Thai language

2018-08-31 Thread Shree Devi Kumar

> Can't encode transcription: 'คุย เดีย ระบบ๑๙ 77 และมี." มิเมือง' in language ''
> I don't know what causes this kind of warning and how to solve it so I just continue the training.

These are related to normalization and validation of the training text. Please see https://github.com/tesseract-ocr/te
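One way to investigate a line that triggers this warning is to list its code points and compare them against the characters the model is expected to know. The sketch below is only an illustration using Python's standard `unicodedata` module. It is not Tesseract's own normalization or validation code, and the helper names `describe_codepoints` and `missing_from_charset`, as well as the idea of passing in a plain set of known characters, are assumptions made for this example.

```python
import unicodedata


def describe_codepoints(text):
    """Return (char, 'U+XXXX', Unicode name) for each character of the NFC-normalized text."""
    normalized = unicodedata.normalize("NFC", text)
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in normalized
    ]


def missing_from_charset(text, charset):
    """Return the sorted non-space characters of the NFC-normalized text that are not in charset.

    `charset` is a hypothetical set of characters the model is assumed to
    support; how you obtain it depends on your training setup.
    """
    normalized = unicodedata.normalize("NFC", text)
    return sorted({ch for ch in normalized if not ch.isspace() and ch not in charset})


# The transcription quoted in the warning above.
line = 'คุย เดีย ระบบ๑๙ 77 และมี." มิเมือง'
for ch, codepoint, name in describe_codepoints(line):
    print(codepoint, name)
```

Printing the code points makes it easier to spot mixed digit systems, unusual punctuation, or unexpected combining-mark order in the training text before training continues.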

Re: [tesseract-ocr] Experiment with Thai language

2018-08-31 Thread sanparith marukatat

Thanks :)

On Friday, August 31, 2018 at 3:29:21 PM UTC+7, shree wrote:
> A few points to note:
>
> 1. The langdata repo has training data for 3.04. Please use the langdata_lstm repo for training data for LSTM training.
>
> 2. To train from existing models, you need to use traineddata files from te