Also, is there a maximum number of lines that a training_text file should not exceed? And what is the ideal number of lines for such a file when training Tesseract from scratch?
Thanks.

On Sunday, 17 January 2021 at 12:14:14 UTC+1 Adriana Camilleri wrote:
> Hi all,
>
> By any chance, is there any training_text available which contains all of
> the characters within the Latin.unicharset file
> (https://github.com/tesseract-ocr/langdata_lstm/blob/master/script/Latin/Latin.unicharset),
> please?
>
> If not, are there any tools available to create the training_text file
> with the desired charset?
>
> Thanks,
>
> Adriana