Please see https://tesseract-ocr.github.io/tessdoc/Data-Files.html
Also see the README files in the three repos, for example https://github.com/tesseract-ocr/tessdata_fast

On Thu, Jan 28, 2021, 03:20 Peter Kronenberg <peter.kronenb...@torch.ai> wrote:

> Hi, can someone help with these questions? Just trying to understand
> better how the language models are used and what is the difference between
> them.
>
> Thanks
> Peter
>
> *From:* tesseract-ocr@googlegroups.com <tesseract-ocr@googlegroups.com> *On
> Behalf Of *Peter Kronenberg
> *Sent:* Thursday, January 21, 2021 12:59 PM
> *To:* tesseract-ocr@googlegroups.com
> *Subject:* {EXTERNAL}[tesseract-ocr] Installing tessdata
>
> I see that the default tessdata just has English and OSD. I see all the
> other data at https://github.com/tesseract-ocr/tessdata. Do I just copy
> those to the same tessdata directory? The repo has a much larger version
> of eng.traineddata than what comes with Tesseract. Can I just replace it?
>
> And what is the difference of the ones in the script directory?
>
> In the directory from the initial install, not only do I have
> eng.traineddata, but there is also user-patterns, user-words and other
> files. Do those files exist for the other languages as well?
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduVdDOzm5GMF0jSvfw7vSpMqeDRH%3Db90Qza4L%2B3tMM5UWg%40mail.gmail.com.
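
For the "do I just copy those to the same tessdata directory?" part of the question, here is a minimal sketch, assuming additional languages are installed by placing their .traineddata files in the tessdata directory that Tesseract already reads. The function name install_traineddata and both directory arguments are hypothetical, chosen only for illustration, and this is not an official installer. By default it does not overwrite a model that is already present, so a bundled eng.traineddata is replaced only if you ask for it.

```python
import shutil
from pathlib import Path


def install_traineddata(source_dir, tessdata_dir, overwrite=False):
    """Copy every *.traineddata file from source_dir into tessdata_dir.

    Returns the sorted names of the files that were copied. Files that
    already exist in tessdata_dir are left alone unless overwrite=True.
    Both directories are supplied by the caller (hypothetical paths).
    """
    source = Path(source_dir)
    target = Path(tessdata_dir)
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for model in sorted(source.glob("*.traineddata")):
        destination = target / model.name
        if destination.exists() and not overwrite:
            continue  # keep the model that is already installed
        shutil.copy2(model, destination)
        copied.append(model.name)
    return copied
```

A call might look like install_traineddata("downloads/tessdata_fast", "/path/to/tessdata"), with both paths adjusted to your own setup. Running `tesseract --list-langs` afterwards shows which languages Tesseract can see.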