I am using the API to read data from an image. I have created training files for the fonts I process, and I pre-process the image to deskew and clean it. Entirely numeric data reads perfectly, e.g. 123456. Entirely alphabetic data also reads perfectly, e.g. ABCDEFGH. The problem arises when I try to read text where the two are combined, e.g. 12ABC3456. In this case there are lots of errors (B and 8 get mixed up, for example). I have tried setting load_system_dawg and load_freq_dawg to false, but that did not help. Are there any other configuration changes I can make to help?
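For illustration, the two dictionary settings mentioned above could be expressed roughly as follows. This is only a minimal sketch: it builds a command line for the tesseract executable instead of calling the API, and the helper name build_tesseract_args and the file name sample.png are hypothetical placeholders, not part of the actual setup.

```python
def build_tesseract_args(image_path, disable_dawgs=True):
    """Build an argument list for the tesseract command-line tool.

    Hypothetical helper for illustration; it does not run tesseract itself.
    """
    # "stdout" asks tesseract to print the recognised text instead of writing a file.
    args = ["tesseract", image_path, "stdout"]
    if disable_dawgs:
        # Switch off the main dictionary and the frequent-word dictionary,
        # i.e. the two settings described in the post.
        args += ["-c", "load_system_dawg=false", "-c", "load_freq_dawg=false"]
    return args


if __name__ == "__main__":
    # sample.png is a placeholder file name.
    print(" ".join(build_tesseract_args("sample.png")))
```

The resulting list could be passed to subprocess.run to invoke tesseract, assuming the executable is on the PATH.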