Hi, can someone help with these questions? I'm just trying to better understand how the language models are used and what the differences between them are.
Thanks,
Peter

From: tesseract-ocr@googlegroups.com <tesseract-ocr@googlegroups.com> On Behalf Of Peter Kronenberg
Sent: Thursday, January 21, 2021 12:59 PM
To: tesseract-ocr@googlegroups.com
Subject: [tesseract-ocr] Installing tessdata

I see that the default tessdata just has English and OSD. I see all the other data at https://github.com/tesseract-ocr/tessdata. Do I just copy those to the same tessdata directory?

The repo has a much larger version of eng.traineddata than what comes with Tesseract. Can I just replace it? And what is different about the ones in the script directory?

In the directory from the initial install, not only do I have eng.traineddata, but there are also user-patterns, user-words and other files. Do those files exist for the other languages as well?

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/MN2PR20MB268642993B65C83511CFAF88E7A19%40MN2PR20MB2686.namprd20.prod.outlook.com.