I am using the latest version (from GitHub).

<https://lh3.googleusercontent.com/-Ne9c4xgkQLQ/WmCBgmqvKGI/AAAAAAAACQo/9ew6gf62RMcdNX_-YpG4K0qt0J26U4fMgCLcBGAs/s1600/Screenshot%2Bfrom%2B2018-01-18%2B16-41-37.png>

On Sunday, January 14, 2018 at 12:31:17 PM UTC+5:30, Sumedhe Dissanayake wrote:
>
> I tried lstmtraining with the Sinhala language but I always get this error.
>
> Command:
>
> lstmtraining --traineddata ~/tesstutorial/sintrain/sin/sin.traineddata \
>   --net_spec '[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c155]' \
>   --debug_interval 0 --max_iterations 500000 --max_image_MB 60000 --learning_rate 20e-4 \
>   --model_output ~/tesstutorial/sinoutput/base \
>   -U ~/tesstutorial/sintrain/sin/sin.unicharset \
>   --traineddata ~/tesstutorial/sintrain/sin/sin.traineddata \
>   --train_listfile ~/tesstutorial/sintrain/sin.training_files.txt
>
> Error:
> Can't encode transcription: 'වැනි නිර්භීත දැන් පියඹා මෙන්ම හා' in language ''
>
> <https://lh3.googleusercontent.com/-OI3Fa2QpWgk/WllqKRXYOBI/AAAAAAAAB1g/6gGg9l6txgItGlpGaAfPa4sNKfHYgL75QCLcBGAs/s1600/Screenshot%2Bfrom%2B2018-01-09%2B21-29-43.png>
>
> I also tried with the English language, and it worked well.
>
> How can I resolve this issue?
>
> Platform:
> Linux Ubuntu 16.04 LTS
>
> Tesseract Version:
> tesseract 4.00.00alpha
>  leptonica-1.74.4
>   libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.4 : libopenjp2 2.1.0
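
The error in the quoted message says the transcription could not be encoded, so one thing that may be worth checking (nothing in this thread confirms it is the cause) is whether every character of the failing line appears somewhere in sin.unicharset. Below is a rough Python sketch of such a check. It assumes the unicharset file has an entry count on its first line and one entry per following line with the unichar as the first field, and that NULL, Joined and |Broken|0|1 are placeholder entries rather than literal text. The function names are made up for illustration and are not part of Tesseract. It only compares individual code points, so a clean result does not prove that the line can be encoded.

```python
import sys
import unicodedata

# Assumed placeholder entries in a unicharset that are not literal characters.
SPECIAL_ENTRIES = {"NULL", "Joined", "|Broken|0|1"}


def load_unichars(lines):
    """Return the first field of each entry line, skipping the count line."""
    units = set()
    for line in lines[1:]:  # assumption: the first line is the entry count
        fields = line.split()
        if fields and fields[0] not in SPECIAL_ENTRIES:
            units.add(fields[0])
    return units


def uncovered_codepoints(text, units):
    """List the code points of text that occur in no unicharset entry."""
    covered = set("".join(units))
    return sorted({ch for ch in text if not ch.isspace() and ch not in covered})


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python check_unicharset.py sin.unicharset "transcription"
    with open(sys.argv[1], encoding="utf-8") as handle:
        units = load_unichars(handle.read().splitlines())
    for ch in uncovered_codepoints(sys.argv[2], units):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}")
```

If the script prints any code points for the transcription from the error message, those characters are absent from the unicharset as read under the assumptions above.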