[tesseract-ocr] Re: How to start from scratch (new language) in Tesseract 5

Jephthah Anga Thu, 16 Nov 2023 07:39:37 -0800

Hi Des,

I am attempting to walk the same path you just walked and was hoping you 
could tell me where to start. I want to train a new language in Tesseract 
so that it can recognize text in that language. How do I create the files 
you mentioned? Is there a central wiki with all the info I need to get 
started? What were the biggest challenges you faced, and in your opinion, 
is it feasible to create a new language?


Thank you for your help.

On Sunday, September 10, 2023 at 2:49:15 p.m. UTC-2:30 desal...@gmail.com 
wrote:

> I am trying to train a new language. I have prepared all the necessary 
> files as per the manual. I have also combined them into a traineddata file 
> using the *combine_lang_model* command.
>
> - I also have my training files, such as the text files, box files and 
> .lstmf files, inside the oro-ground-truth folder.
>
>
> But I am having trouble proceeding from there. All the instructions for 
> training from scratch talk about using tesstrain.sh, which the manual 
> calls unsupported and outdated.
>
> - What should I do? Can you guys help me please?
>

