I am currently running a training run on synthetic training data for
Sanskrit, to support both Devanagari script with Vedic accents and
IAST (Roman with diacritics). I will share the traineddata with you
and anyone else who is interested, so you can test how well it works on
real-life images.
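
As a rough illustration of the text-preparation side of synthetic
training data, here is a minimal Python sketch. It only builds
NFC-normalized training lines and checks that every character from the
list quoted below is covered; rendering the lines to images and running
the actual training are left to the Tesseract training tools and are not
shown. The word list and the function names (nfc, missing_chars,
make_training_lines) are hypothetical and for illustration only; a real
run would draw on a large IAST corpus.

```python
import random
import unicodedata

# Characters from the quoted message. The digraphs "ṭh" and "ḍh" are
# simply ṭ or ḍ followed by a plain "h", so they need no entry of their own.
TARGET_CHARS = ["ṭ", "Ṭ", "ṅ", "ḍ", "ṇ", "ṃ", "ṣ", "Ḥ", "ḥ"]

# Hypothetical sample words in IAST (assumption: stand-in for a real corpus).
SAMPLE_WORDS = [
    "aṣṭa", "Ṭīkā", "aṅga", "paṇḍita",
    "saṃskṛta", "namaḥ", "NAMAḤ", "pāṭha", "dṛḍha",
]


def nfc(text):
    """Normalize to NFC so base letter + combining mark become one code point."""
    return unicodedata.normalize("NFC", text)


def missing_chars(lines, targets=TARGET_CHARS):
    """Return the target characters that never occur in the given lines."""
    seen = set(nfc("".join(lines)))
    return [c for c in targets if nfc(c) not in seen]


def make_training_lines(words, n_lines, words_per_line=5, seed=0):
    """Build lines by cycling through shuffled copies of the word list,
    so every word is used before any word is repeated."""
    rng = random.Random(seed)
    pool = []
    lines = []
    for _ in range(n_lines):
        picked = []
        for _ in range(words_per_line):
            if not pool:
                pool = list(words)
                rng.shuffle(pool)
            picked.append(pool.pop())
        lines.append(nfc(" ".join(picked)))
    return lines


if __name__ == "__main__":
    lines = make_training_lines(SAMPLE_WORDS, n_lines=4)
    for line in lines:
        print(line)
    print("missing:", missing_chars(lines))
```

Normalizing to NFC first matters because the same letter can be stored
either as one precomposed code point or as a base letter plus a combining
dot; mixing the two forms in the training text would make one visible
character look like two different symbols.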

On Mon, Sep 28, 2020, 10:43 shreyansh dwivedi <advocates...@gmail.com>
wrote:

> Hello everyone,
> I want to train some diacritics which are not present in the Latin trained
> model. Apart from Latin, I used the Vietnamese and Latvian trained models,
> but some of the diacritics are missing from those models too. Some of the
> missing characters that I need to recognise are listed below.
> ṭ
> Ṭ
> ṅ
> ṭh
> ḍ
> ḍh
> ṇ
> ṃ
> ṣ
> Ḥ
> ḥ
> I want to train the above diacritics so that the Tesseract engine can
> recognise these characters in text images.
> Any help would be appreciated, and an explanation from scratch would be a
> great way to understand.
> Thank you!
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/CAMREWd6R%2Bec5r%3D77%2BRWGM7PUKZPqqJT%2BkNX6r9zwijvW5sxykQ%40mail.gmail.com
>
