I also noticed that you are using just one font for training and the same font for evaluation.
While probably unrelated to the errors you are getting, LSTM training from scratch requires a large number of fonts and a large amount of training text. You should try fine-tuning instead, which adapts the current best model to the font you need.

On 19-Jan-2018 7:23 AM, "ShreeDevi Kumar" <shreesh...@gmail.com> wrote:

> Take a look at the lines that are getting the error and check that all
> characters are in the unicharset generated by training.
>
> The size of lstm-unicharset is different from the one generated by the
> training text; note the message shown at the beginning of training.
>
> Check the GitHub issues; one of the most recent ones is about differing
> sizes of unicharset and its impact on training.
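
The check suggested in the quoted message (that every character in the failing lines is present in the unicharset) can be sketched roughly as follows. This is only a minimal illustration, not part of the Tesseract tooling: the function names `parse_unicharset` and `missing_chars` are made up here, and the parsing assumes the usual unicharset text layout (first line is the entry count, each following line starts with the unichar, and the space character is stored as `NULL`).

```python
def parse_unicharset(unicharset_text):
    """Return the set of unichars listed in the text of a unicharset file.

    Assumption: line 1 holds the entry count, and every later line begins
    with the unichar followed by space-separated properties.
    """
    unichars = set()
    for line in unicharset_text.splitlines()[1:]:
        if not line.strip():
            continue
        token = line.split(" ", 1)[0]
        # Assumption: the space character is written as the token "NULL".
        unichars.add(" " if token == "NULL" else token)
    return unichars


def missing_chars(training_text, unichars):
    """Return the sorted characters of training_text not covered by unichars.

    Entries may span several code points, so this compares at the code point
    level; it is an approximation, not an exact unichar segmentation.
    """
    covered = {cp for entry in unichars for cp in entry}
    return sorted(
        {ch for ch in training_text if ch not in covered and ch not in "\r\n"}
    )
```

To use it, read both files as UTF-8 text and pass their contents in, for example `missing_chars(training_text, parse_unicharset(unicharset_text))`. An empty result means every character in the text is covered; anything returned is a candidate cause for lines that fail during training.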