https://github.com/tesseract-ocr/tesseract/issues/549
@harinath141 <https://github.com/harinath141> If you are getting a lot of these errors during finetune, try replace top layer training. You can use the box/tiff pairs generated for finetune. Commands will be similar to the following:

mkdir -p ~/tesstutorial/tellayer_from_tel
combine_tessdata -e ../tessdata/tel.traineddata \
  ~/tesstutorial/tellayer_from_tel/tel.lstm
lstmtraining -U ~/tesstutorial/tel/tel.unicharset \
  --script_dir ../langdata --debug_interval 0 \
  --continue_from ~/tesstutorial/tellayer_from_tel/tel.lstm \
  --append_index 5 --net_spec '[Lfx256 O1c105]' \
  --model_output ~/tesstutorial/tellayer_from_tel/tellayer \
  --train_listfile ~/tesstutorial/tel/tel.training_files.txt \
  --target_error_rate 0.01

I found this post that you wrote, but --script_dir doesn't work in lstmtraining. How do I change this option (flag)? What has replaced it?

On Tuesday, March 13, 2018 at 4:24:52 PM UTC+9, shree wrote:
>
> That info is given in the training wiki page.
>
> On Tue 13 Mar, 2018, 12:53 PM 이경준, <player...@gmail.com> wrote:
>
>> There is no explanation there of how to replace the top layer ... T_T
>>
>> On Tuesday, March 13, 2018 at 4:22:08 PM UTC+9, shree wrote:
>>>
>>> https://github.com/tesseract-ocr/tesseract/issues/1009
>>>
>>> Link works ok
>>>
>>> On Tue 13 Mar, 2018, 12:37 PM 이경준, <player...@gmail.com> wrote:
>>>
>>>> Shreeshrii <https://github.com/Shreeshrii> commented on 29 Jun 2017
>>>> <https://github.com/tesseract-ocr/tesseract/issues/1012#issuecomment-311892286>
>>>>
>>>> I think this happens when the complex characters in your training text
>>>> are not part of the original Korean Unicharset that the 4.00.00alpha
>>>> kor.traineddata was trained with.
>>>>
>>>> Do 'replace top layer' training instead of finetune. @abhishekchopde
>>>> <https://github.com/abhishekchopde> has had good results with it - see
>>>> #1009 <https://github.com/tesseract-ocr/tesseract/issues/1009>
>>>>
>>>> It will take longer than finetuning.
>>>>
>>>> Hi shree, I have a question. You posted this passage, but the link in it
>>>> is not right. Please check again.
>>>>
>>>
>>
>
-- 
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/7ba3c6fe-c66d-428d-95ee-aed8e149c6b9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.