[tesseract-ocr] Scripts to generate langdata

Sim Tov Mon, 16 Aug 2021 08:49:59 -0700

Hello,

I'm learning how to train Tesseract for a new script, and one of the stages
is generating langdata.


I saw the examples here:

https://raw.githubusercontent.com/tesseract-ocr/langdata_lstm

I can provide lang/lang.training_text and lang/lang.wordlist.

What is the purpose of the rest of the files, e.g. lang.unicharambigs and
lang.singles_text? Do they depend on lang/lang.training_text and
lang/lang.wordlist, and if so, how do I generate them?

Thank you!

