I used combine_lang_model like this:

    combine_lang_model --input_unicharset ../combinelangmodel/fas.lstm-unicharset \
      --script_dir ../combinelangmodel/sdir \
      --outputdir outputdir \
      --lang fas \
      --lang_is_rtl true \
      --words ..\lists\fas.wordlist \
      --puncs ..\lists\fas.punc \
      --numbers ..\lists\fas.numbers

BTW, I got fas.lstm-unicharset by running combine_tessdata with -u on the official fas.traineddata, and I took fas.wordlist, fas.punc and fas.numbers from the langdata repo.

Almost everything works, except that when I unpack the resulting traineddata there is no dawg file in it, although the help says that if the three word lists are provided, the dawg files are also added to the traineddata file.

Can you please help me and show me which part I am doing wrong? The extra spacing in the command is only for readability here.
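
One thing that may be worth ruling out before anything else is whether the three list paths actually reach the tool as written. This is a minimal sketch, not a confirmed diagnosis: it assumes the command is run from a POSIX shell (the post does not say which shell is used), where an unquoted backslash is an escape character rather than a path separator. The helper name check_lists is hypothetical and is not part of Tesseract.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): report whether each given
# path resolves to a readable, non-empty file.
check_lists() {
    status=0
    for f in "$@"; do
        if [ -r "$f" ] && [ -s "$f" ]; then
            echo "ok: $f"
        else
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}

# In a POSIX shell the unquoted backslashes are consumed as escapes,
# so the tool would receive a path without any separators:
printf '%s\n' ..\lists\fas.wordlist    # prints ..listsfas.wordlist

# The same paths written with forward slashes (assumed directory layout,
# taken from the command in the post):
check_lists ../lists/fas.wordlist ../lists/fas.punc ../lists/fas.numbers || true
```

If any line comes back as "missing or empty", the word lists are not being found from the current directory, which would be one possible reason for the dawg files being absent. If all three report "ok", the cause lies elsewhere.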