Hi Des, I am attempting to walk the same path you just walked and was hoping you could tell me where to start. I want to train Tesseract on a new language so that it can recognize text in that language. How do I create the files you mentioned above? Is there a central wiki with all the info I need to get started? What were the biggest challenges you faced and, in your opinion, is it feasible to attempt to create a new language?
Thank you for your help.

On Sunday, September 10, 2023 at 2:49:15 p.m. UTC-2:30 desal...@gmail.com wrote:

> I am trying to train a new language. I have prepared all the necessary
> files as per the manual. I have also combined them into a trained data file
> using the combine_lang_model command.
>
> - I also have my training files, such as the text files, box files and
>   .lstmf files, inside the oro-ground-truth folder.
>
> But I am having trouble proceeding from there. All the instructions for
> training from scratch talk about using tesstrain.sh, which the manual
> calls unsupported and outdated.
>
> - What should I do? Can you guys help me please?