So:

1. If you have a problem, use the example data (ocrd-testset.zip) or provide your own data set so that the problem can be reproduced.
2. Make sure you use the latest version of tesstrain.
3. *make training* on its own does not produce the output you presented.

Provide the real steps for reproducing the problem if you are interested in help.
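The quoted log ends with "Failed to load list of training filenames from data/korletter/list.train", so a quick first check is whether the intermediate files named in that log exist and contain any entries. The following is only a minimal Python sketch under that assumption. The helper names `count_entries` and `summarize_training_lists` are hypothetical and not part of tesstrain, and the paths are copied from the quoted log, so adjust them to your setup.

```python
from pathlib import Path


def count_entries(path):
    """Return the number of non-blank lines in path, or None if it is missing."""
    p = Path(path)
    if not p.is_file():
        return None
    with p.open(encoding="utf-8", errors="replace") as fh:
        return sum(1 for line in fh if line.strip())


def summarize_training_lists(data_dir="data", model="korletter"):
    """Count entries in the intermediate files that appear in the quoted log.

    The file names (all-gt, all-lstmf, list.train, list.eval) are taken from
    the log; the default data_dir and model are assumptions based on it.
    """
    model_dir = Path(data_dir) / model
    names = ["all-gt", "all-lstmf", "list.train", "list.eval"]
    return {name: count_entries(model_dir / name) for name in names}


if __name__ == "__main__":
    # Run from the directory where "make training" was started.
    for name, count in summarize_training_lists().items():
        if count is None:
            status = "missing"
        elif count == 0:
            status = "empty"
        else:
            status = f"{count} entries"
        print(f"{name}: {status}")
```

If list.train is reported as missing or empty, that would suggest the problem lies in the ground-truth data or an earlier step rather than in lstmtraining itself, which is one more reason to post the exact steps and data used.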
Zdenko

On Wed, 29 May 2024 at 15:45, Duy Hoàng <duynguyen....@gmail.com> wrote:

> I'm creating a training file on Windows based on the instructions here:
> https://github.com/tesseract-ocr/tesstrain/
>
> I am using tesseract ocr version 5.3.4.
> Can someone help me with this case?
>
> $ *make training*
> You are using make version: 4.4.1
> unicharset_extractor --output_unicharset "data/korletter/unicharset" --norm_mode 2 "data/korletter/all-gt"
> Extracting unicharset from plain text file data/korletter/all-gt
> Wrote unicharset file data/korletter/unicharset
> python shuffle.py 0 "data/korletter/all-lstmf"
> python generate_eval_train.py data/korletter/all-lstmf 0.90
> dos2unix "data/korletter/korletter.numbers"
> dos2unix: data/korletter/korletter.numbers: No such file or directory
> dos2unix: Skipping data/korletter/korletter.numbers, not a regular file.
> make: [Makefile:290: data/korletter/korletter.traineddata] Error 2 (ignored)
> dos2unix "data/korletter/korletter.punc"
> dos2unix: data/korletter/korletter.punc: No such file or directory
> dos2unix: Skipping data/korletter/korletter.punc, not a regular file.
> make: [Makefile:291: data/korletter/korletter.traineddata] Error 2 (ignored)
> dos2unix "data/korletter/korletter.wordlist"
> dos2unix: data/korletter/korletter.wordlist: No such file or directory
> dos2unix: Skipping data/korletter/korletter.wordlist, not a regular file.
> make: [Makefile:292: data/korletter/korletter.traineddata] Error 2 (ignored)
> dos2unix "data/langdata/korletter/korletter.config"
> dos2unix: data/langdata/korletter/korletter.config: No such file or directory
> dos2unix: Skipping data/langdata/korletter/korletter.config, not a regular file.
> make: [Makefile:293: data/korletter/korletter.traineddata] Error 2 (ignored)
> combine_lang_model \
>   --input_unicharset data/korletter/unicharset \
>   --script_dir data/langdata \
>   --numbers data/korletter/korletter.numbers \
>   --puncs data/korletter/korletter.punc \
>   --words data/korletter/korletter.wordlist \
>   --output_dir data \
>   \
>   --lang korletter
> Failed to read data from: data/korletter/korletter.wordlist
> Failed to read data from: data/korletter/korletter.punc
> Failed to read data from: data/korletter/korletter.numbers
> Loaded unicharset of size 4 from file data/korletter/unicharset
> Setting unichar properties
> Setting script properties
> Config file is optional, continuing...
> Failed to read data from: data/langdata/korletter/korletter.config
> Null char=2
> Created data/korletter/korletter.traineddata
> lstmtraining \
>   --debug_interval 0 \
>   --traineddata data/korletter/korletter.traineddata \
>   --learning_rate 0.002 \
>   --net_spec "[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx192 O1c4]" \
>   --model_output data/korletter/checkpoints/korletter \
>   --train_listfile data/korletter/list.train \
>   --eval_listfile data/korletter/list.eval \
>   --max_iterations 10000 \
>   --target_error_rate 0.01 \
>   2>&1 | tee -a data/korletter/training.log
> Failed to load list of training filenames from data/korletter/list.train
>
> lstmtraining \
>   --stop_training \
>   --continue_from data/korletter/checkpoints/korletter_checkpoint \
>   --traineddata data/korletter/korletter.traineddata \
>   --model_output data/korletter.traineddata
> Failed to read continue from: data/korletter/checkpoints/korletter_checkpoint
> make: *** [Makefile:347: data/korletter.traineddata] Error 1
>
> --
> You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/397d129c-0e61-4003-9cb4-c6b7f8a615a8n%40googlegroups.com.