I use it as follows, and it works. Please check that you are using the correct paths for the files.

combine_lang_model \
  --input_unicharset ./layersan/san.unicharset \
  --script_dir ~/langdata \
  --words ~/langdata/san/san.wordlist \
  --numbers ~/langdata/san/san.numbers \
  --puncs ~/langdata/san/san.punc \
  --output_dir ./layersan \
  --lang san \
  --pass_through_recoder \
  --version_str `cat ./layersan/san.new.version`

And here is the unpacking of this traineddata file:

~/tesstutorial-deva/layersan/san$ combine_tessdata -u san.traineddata ./san.
Extracting tessdata components from san.traineddata
Wrote ./san.config
Wrote ./san.lstm-punc-dawg
Wrote ./san.lstm-word-dawg
Wrote ./san.lstm-number-dawg
Wrote ./san.lstm-unicharset
Wrote ./san.lstm-recoder
Wrote ./san.version
Version string:4.0.0-beta.4-138-g2093:san:shreeshrii20180917:from:4.00.00alpha:Devanagari:synth20170629test
0:config:size=1013, offset=192
18:lstm-punc-dawg:size=5306, offset=1205
19:lstm-word-dawg:size=15123986, offset=6511
20:lstm-number-dawg:size=450, offset=15130497
21:lstm-unicharset:size=12621, offset=15130947
22:lstm-recoder:size=1552, offset=15143568
23:version:size=92, offset=15145120

On Mon, Sep 17, 2018 at 4:18 PM, Hosein Khoshdel <hoskhosh...@gmail.com> wrote:

> i used combine_lang_model like this:
>
> combine_lang_model --input_unicharset ../combinelangmodel/fas.lstm-unicharset \
> --script_dir ../combinelangmodel/sdir \
> --outputdir outputdir \
> --lang fas \
> --lang_is_rtl true \
> --words ..\lists\fas.wordlist \
> --puncs ..\lists\fas.punc \
> --numbers ..\lists\fas.numbers \
>
> BTW i get fas.lstm-unicharset by using combine_tessdata with -u on
> official fas.traineddata and got fas.wordlist, fas.punc and fas.numbers
> from langdata repo. now almost everything is fine except that when i unpack
> the resulting traineddata there is no dawg file in it although the help
> says that if the 3 word lists are provided the dawg files are also added to
> traineddata file.
> can you please help me and show me what part i am doing wrong?
> also the extra spaces in command is just for better readability here
>
> --
> You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/ecb262d7-d448-4125-a60e-ddf266aea40c%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

--
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
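
P.S. In case it helps with the path check, here is a rough sketch of two small shell helpers. They are hypothetical (the names missing_inputs and check_dawgs are mine, not part of Tesseract): the first lists any input file that is missing or empty, and the second reads the output of combine_tessdata -u and reports whether the three lstm-*-dawg components are mentioned. Comparing the two commands above, it may also be worth checking the option spelling (--outputdir in the quoted command versus --output_dir in mine) and the backslashes in the word list paths, although I have not verified that either one is the cause.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not shipped with Tesseract.

# Print each argument that is not a readable, non-empty file.
# Returns 0 if every file is usable, 1 otherwise.
missing_inputs() {
    status=0
    for f in "$@"; do
        if [ ! -s "$f" ] || [ ! -r "$f" ]; then
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}

# Read the text printed by `combine_tessdata -u` on stdin and report, for each
# of the three LSTM dawg components, whether its name appears in that text.
# Returns 0 only if all three names appear.
check_dawgs() {
    listing=$(cat)
    status=0
    for part in lstm-punc-dawg lstm-word-dawg lstm-number-dawg; do
        if printf '%s\n' "$listing" | grep -qF -- "$part"; then
            echo "found: $part"
        else
            echo "absent: $part"
            status=1
        fi
    done
    return "$status"
}

# Example usage (paths taken from my command above; adjust to your layout):
#   missing_inputs ~/langdata/san/san.wordlist \
#                  ~/langdata/san/san.numbers \
#                  ~/langdata/san/san.punc
#   combine_tessdata -u san.traineddata ./san. 2>&1 | check_dawgs
```

Running missing_inputs before combine_lang_model is just a cheap way to confirm the shell can actually see the three list files at the paths given; the 2>&1 in the second example is there because I am not sure whether the tool prints its listing to stdout or stderr.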