Hi,

Could someone help me understand why I am getting the following error when using tesstrain with the START_MODEL option?

    Failed to continue from: data/micr_ref/micr.lstm

From my local tesstrain repo (cloned from https://github.com/tesseract-ocr/tesstrain), I have the following in my data directory:

    data
    ├── micr-ground-truth
    │   ├── micr-1.gt.txt
    │   ├── micr-1.tif
    │   ├── micr-2.gt.txt
    │   └── micr-2.tif
    └── micr_proto-ground-truth
        ├── micr.gt.txt
        └── micr.tif

I am using what is in 'micr_proto-ground-truth' to build my proto model, which I then use as the START_MODEL for training the micr model from 'micr-ground-truth'. More specifically, I issued the following commands from my tesstrain repo:

    gmake tesseract-langdata
    gmake proto-model MODEL_NAME=micr_proto
    mkdir -p usr/share/tessdata
    cp data/micr_proto/micr_proto.traineddata usr/share/tessdata
    gmake training MODEL_NAME=micr START_MODEL=micr_proto

The final command fails with the following error:

    Failed to continue from: data/micr_proto/micr.lstm
    gmake: *** [Makefile:327: data/micr/checkpoints/micr_checkpoint] Error 1

Can anyone tell me what I am doing wrong?

*Background Info*

My ultimate goal is to train Tesseract to OCR the MICR line at the bottom of check images with 99+% accuracy. For my test/training set, I have more than 20K tif check images, which I have cropped and cleaned using OpenCV so that they include only the bottom portion containing the MICR line. I also have the gt.txt file for each cropped image.

I tried the mcr.traineddata (from https://github.com/BigPino67/Tesseract-MICR-OCR/blob/master/Tessdata/mcr.traineddata) with multiple PSM values, but the accuracy was very low. I also tried using tesstrain directly, with my entire training set in the data directory, as follows:

    gmake training MODEL_NAME=micr

but the resulting micr.traineddata yielded even worse results.

So now I am trying to build my proto model as described above from a single reference image, and then to use that as the START_MODEL for my training, but I am hitting the error I mentioned above. Is my approach incorrect? If so, can you please point me in the right direction?

I am not finding the documentation extremely clear, so I obviously may be doing something stupid.

Thanks much for the help,
Keith

BTW, I am attaching the data.zip (contents of my data directory) in case someone wants to reproduce this.
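
For anyone reproducing this from the data directory described above, here is a minimal sketch of a sanity check on a ground-truth directory. It assumes the layout shown earlier, where every image NAME.tif sits next to a transcription NAME.gt.txt. The function name check_ground_truth is hypothetical and not part of tesstrain, and the check only confirms that the pairs are complete; it does not explain the START_MODEL error itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of tesstrain): report .tif images that have
# no non-empty NAME.gt.txt next to them, and .gt.txt files without an image.
# Returns 0 when every file is paired, non-zero otherwise.
check_ground_truth() {
    dir=$1
    missing=0

    for tif in "$dir"/*.tif; do
        [ -e "$tif" ] || continue            # glob matched nothing
        gt="${tif%.tif}.gt.txt"
        if [ ! -s "$gt" ]; then              # transcription missing or empty
            echo "no transcription for: $tif"
            missing=$((missing + 1))
        fi
    done

    for gt in "$dir"/*.gt.txt; do
        [ -e "$gt" ] || continue             # glob matched nothing
        tif="${gt%.gt.txt}.tif"
        if [ ! -e "$tif" ]; then             # image missing
            echo "no image for: $gt"
            missing=$((missing + 1))
        fi
    done

    echo "unpaired files: $missing"
    [ "$missing" -eq 0 ]
}
```

With the layout above, it could be run from the tesstrain repo as:

    check_ground_truth data/micr-ground-truth
    check_ground_truth data/micr_proto-ground-truth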
<<attachment: data.zip>>