I have uploaded the results of various training runs for Sanskrit in IAST
(Roman with diacritics) and Devanagari at
https://github.com/Shreeshrii/tess5training-sanskrit-iast/tree/main/tessdata/best
The traineddata files and the corresponding lstm-unicharset files have been
uploaded there.
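
For anyone who wants to check whether a given lstm-unicharset covers the diacritics they need, here is a minimal sketch. It assumes the usual unicharset text layout (the first line holds the entry count, and each later line starts with the symbol followed by a space and its properties) and that comparing in NFC form is appropriate. The function names and the sample data are hypothetical and are not taken from the repository.

```python
import unicodedata


def unicharset_symbols(unicharset_text):
    """Return the set of symbols listed in a unicharset file's text.

    Assumption: line 1 is the entry count; every later line begins with
    the symbol, followed by a space and its properties.
    """
    symbols = set()
    for line in unicharset_text.splitlines()[1:]:
        if not line.strip():
            continue
        symbol = line.split(" ", 1)[0]
        symbols.add(unicodedata.normalize("NFC", symbol))
    return symbols


def missing_characters(unicharset_text, wanted):
    """List the wanted characters that do not appear in the unicharset."""
    have = unicharset_symbols(unicharset_text)
    return [ch for ch in wanted
            if unicodedata.normalize("NFC", ch) not in have]


# Hypothetical sample; the property columns are placeholders only.
SAMPLE = "4\nNULL 0 Common 0\nṭ 3 placeholder\nḥ 3 placeholder\na 3 placeholder\n"

print(missing_characters(SAMPLE, ["ṭ", "ḥ", "ṅ"]))
```

With a real file, the text could be read with open(path, encoding="utf-8").read() and passed in the same way. Multi-letter sequences such as "ṭh" would normally be checked letter by letter, since they may not be stored as a single entry.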

The training was done mostly with line images of synthetic training
data in various fonts. On evaluation datasets of synthetic data not seen
during training, I get a CER (character error rate) of 2-3%. I am curious
to know how well these models perform on real-life images.
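
For testers who want a comparable number on their own images, here is a minimal sketch of a character error rate calculation: the Levenshtein edit distance between the OCR output and the ground truth, divided by the length of the ground truth. This is an illustration, not necessarily how the 2-3% figure was computed. The function names are hypothetical, and normalizing both strings to NFC before comparing is an assumption that matters for text with diacritics.

```python
import unicodedata


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ground_truth, ocr_output):
    """Character error rate of ocr_output relative to ground_truth."""
    ref = unicodedata.normalize("NFC", ground_truth)
    hyp = unicodedata.normalize("NFC", ocr_output)
    if not ref:
        raise ValueError("ground truth must not be empty")
    return levenshtein(ref, hyp) / len(ref)


# Two dropped diacritics in an eight-character word.
print(cer("saṃskṛta", "samskrta"))
```

Counting by code point after NFC normalization is a design choice; counting by grapheme cluster would give somewhat different figures for Devanagari.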

I would appreciate it if those who are testing could send me a few of their
test images along with the ground truth text.






On Mon, Sep 28, 2020 at 12:19 PM Shree Devi Kumar <shreesh...@gmail.com>
wrote:

> I am currently running a training run based on synthetic training data for
> Sanskrit to support both Devanagari script with Vedic accents and
> IAST (Roman with diacritics). I will share the traineddata with you
> and others who are interested in testing how well it works with real-life
> images.
>
> On Mon, Sep 28, 2020, 10:43 shreyansh dwivedi <advocates...@gmail.com>
> wrote:
>
>> Hello everyone,
>> I want to train some diacritical characters that are not present in the
>> latin.trained model. Apart from Latin, I used the Vietnamese and Latvian
>> trained models, but some of the diacritics are missing from those models
>> too. Some of the missing characters that I need to recognise are listed below.
>> ṭ
>> Ṭ
>> ṅ
>> ṭh
>> ḍ
>> ḍh
>> ṇ
>> ṃ
>> ṣ
>> Ḥ
>> ḥ
>> I want to train the Tesseract engine to recognise the above diacritical
>> characters in text images.
>> Any help would be appreciated, and an explanation from scratch would be a
>> great way to understand.
>> Thank you!
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to tesseract-ocr+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/CAMREWd6R%2Bec5r%3D77%2BRWGM7PUKZPqqJT%2BkNX6r9zwijvW5sxykQ%40mail.gmail.com
>>
>

-- 

____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

