I have already used Tesseract 4.0 to train on handwritten digits.
The steps are as follows:
1. Collect some handwriting-style fonts, from Google or anywhere else.
This is the easiest way to get training data.
2. Create a text corpus containing only the digits 0-9 in random order
(for example, with a random function), then run the "tesstrain.sh" script
on that corpus to generate the starter traineddata.
3. Use the starter traineddata to generate the final traineddata after
LSTM training.
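
For step 2, a random digit corpus could be produced with something like
the Python sketch below. This is only an illustration, not the exact
script I used: the function names, the output file name
"digits.training_text", the line count and the group lengths are all
assumptions, so adjust them to your own setup.

```python
import random

DIGITS = "0123456789"


def make_digit_corpus(num_lines=1000, min_len=1, max_len=10, seed=None):
    """Return a list of text lines made only of random digit groups.

    Each line holds 3 to 8 groups separated by single spaces, and each
    group is min_len to max_len digits long. These sizes are arbitrary
    choices for illustration.
    """
    rng = random.Random(seed)  # a seed makes the corpus reproducible
    lines = []
    for _ in range(num_lines):
        groups = []
        for _ in range(rng.randint(3, 8)):
            length = rng.randint(min_len, max_len)
            groups.append("".join(rng.choice(DIGITS) for _ in range(length)))
        lines.append(" ".join(groups))
    return lines


def write_corpus(path, lines):
    """Write the corpus as UTF-8 text, one line per entry."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
```

Calling write_corpus("digits.training_text", make_digit_corpus(seed=0))
would write a 1000-line file that can then be used as the training text
for tesstrain.sh.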


If you want a detailed description, I can supply you with a complete
documentation of steps.

Chandra Churh Chatterjee


On Tue, Jul 17, 2018, 8:43 PM Ramakant Kushwaha <ramakant.sing...@gmail.com>
wrote:

> *Hi,*
>
> *I have recently been trying to retrain Tesseract 4.0 to recognise
> handwritten digits. I am following the official page but finding it very
> difficult. It would be great if someone could elaborate on the steps below:*
>
> - Prepare training text.
> <https://github.com/tesseract-ocr/tesseract/issues/654#issuecomment-274574951>
> (I am using jTessBoxEditor for creating box files.)
> - Render text to image + box file. (Or create hand-made box files for
> existing image data.)
> - Make unicharset file. (Can be partially specified, i.e. created
> manually.) (I do not know how to do this.)
> - Make a starter traineddata from the unicharset and optional dictionary
> data.
> <https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#creating-starter-traineddata>
> - Run tesseract to process image + box file to make training data set.
> - Run training on training data set.
> - Combine data files.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/97e29010-f602-42e9-b3b8-121fb151a49e%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/97e29010-f602-42e9-b3b8-121fb151a49e%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

