If anyone following this thread is using OCR-D: I had to modify the .py file because I kept getting a Unicode error. Just add these lines to the file:

    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')

On Tuesday, July 24, 2018 at 4:41:45 PM UTC-5, Emiliano Isaza Villamizar wrote:
> It worked, maybe I was using another eng.traineddata. Thank you for your
> time Shree and Lorenzo
>
> kind regards,
> Emiliano
>
> On Tuesday, July 24, 2018 at 11:40:34 AM UTC-5, shree wrote:
>>
>>>> --continue_from /home/tulipan1637/Documents/Emiliano/OCR/OCRtraining/ocrd-train/tessdata/eng.lstm \
>>>> --old_traineddata /home/tulipan1637/Documents/Emiliano/OCR/OCRtraining/ocrd-train/tessdata/ \
>>
>> Use eng.traineddata from tessdata_best
>> https://github.com/tesseract-ocr/tessdata_best
>>
>> and extract the lstm file from it.
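
P.S. The three lines above only run on Python 2, since Python 3 has no built-in reload() and no sys.setdefaultencoding(). A possible alternative, offered only as a sketch, is to open text files with an explicit UTF-8 encoding instead of changing the interpreter-wide default. The helper names read_utf8 and write_utf8 are hypothetical and are not part of OCR-D or ocrd-train. Which file reads or writes in the script actually trigger the Unicode error has not been checked here, so where to call them is an assumption.

```python
import io


def read_utf8(path):
    """Read a text file as UTF-8 on both Python 2 and Python 3."""
    # io.open takes an explicit encoding on both major versions, so the
    # result does not depend on the interpreter's default encoding.
    with io.open(path, "r", encoding="utf-8") as handle:
        return handle.read()


def write_utf8(path, text):
    """Write unicode text to a file as UTF-8."""
    # On Python 2 the text must be a unicode object (u"..."), not a byte str.
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
```

With helpers like these, a ground-truth transcription containing accented characters could be read with read_utf8(path) wherever the script currently calls a plain open(). The sys.setdefaultencoding() workaround is quicker to apply, but it changes behaviour for every module in the process.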