training/lstmeval --model ~/tesstutorial/engoutput/base_checkpoint \
  --traineddata ~/tesstutorial/engtrain/eng/eng.traineddata \
  --eval_listfile ~/tesstutorial/engeval/eng.training_files.txt

training/lstmeval --model tessdata/best/eng.traineddata \
  --eval_listfile ~/tesstutorial/engeval/eng.training_files.txt

On Thu, 27 Jun 2019, 22:47 Shree Devi Kumar, <shreesh...@gmail.com> wrote:

> See
> https://github.com/tesseract-ocr/tesseract/blob/master/doc/lstmeval.1.asc
>
> When using a checkpoint, you also need to pass the starter traineddata file
> used for training.
>
> Or give the final traineddata file as the model.
>
> So, if after training you have converted the checkpoint to a traineddata,
> you can use that as the model. The same applies to the original traineddata.
>
> On Thu, 27 Jun 2019, 21:46 Arno Loo, <arno.laf...@gmail.com> wrote:
>
>> Hello,
>>
>> I just finished my first training of Tesseract 4.0 and I ran lstmeval
>> on the generated model, which I named *mod01*.
>> I used this command line:
>> lstmeval --model data/checkpoints/mod01_checkpoint --traineddata ./usr/share/tessdata/mod01.traineddata --eval_listfile data/list.eval
>>
>> It worked fine and gave me a character error rate and a word error
>> rate. Now I would like to know whether my training improved Tesseract's
>> accuracy on my specific documents. So I wanted to run the evaluation on
>> the same dataset, but with the model I started the training from, the
>> English model provided in Tesseract's GitHub repo: eng.traineddata. I tried:
>> lstmeval --traineddata ./usr/share/tessdata/eng.traineddata --eval_listfile data/list.eval
>> But it did not work because I did not provide any --model.
>>
>> This showed me that my understanding of Tesseract was not correct.
>> Since downloading a new *lang.traineddata* is enough to use Tesseract
>> with that language, I thought that the whole model was contained in the
>> traineddata file. What is this --model argument, then?
>> My research on the web told me to put the last checkpoint of my
>> training there, but without explaining why.
>>
>> Is it possible, then, to run lstmeval on a pretrained model like
>> eng.traineddata?
>>
>> Thank you!
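For anyone scripting the before/after comparison, the rule from this thread (a checkpoint needs the starter traineddata as well, while a complete traineddata file is passed as the model on its own) could be sketched roughly as follows. This is only a minimal sketch: `lstmeval_args` is a hypothetical helper, not part of Tesseract, and telling the two cases apart by the `.traineddata` file suffix is my assumption. It only builds the command lines and does not run them.

```python
from typing import List, Optional


def lstmeval_args(model: str, eval_listfile: str,
                  traineddata: Optional[str] = None) -> List[str]:
    """Build an lstmeval command line (hypothetical helper, not a Tesseract API)."""
    args = ["lstmeval", "--model", model]
    # Assumption: a model path that does not end in ".traineddata" is a checkpoint.
    is_checkpoint = not model.endswith(".traineddata")
    if is_checkpoint:
        # A checkpoint must be paired with the starter traineddata used for training.
        if traineddata is None:
            raise ValueError("a checkpoint model also needs --traineddata")
        args += ["--traineddata", traineddata]
    # A complete traineddata file is given as --model by itself.
    args += ["--eval_listfile", eval_listfile]
    return args


eval_list = "~/tesstutorial/engeval/eng.training_files.txt"

# Evaluate the training checkpoint together with its starter traineddata.
print(" ".join(lstmeval_args(
    "~/tesstutorial/engoutput/base_checkpoint",
    eval_list,
    traineddata="~/tesstutorial/engtrain/eng/eng.traineddata",
)))

# Evaluate the original pretrained model on the same list for comparison.
print(" ".join(lstmeval_args("tessdata/best/eng.traineddata", eval_list)))
```

Running both printed commands against the same list file gives the two sets of error rates to compare.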