Sorry, there seems to be a regression in the file posted on GitHub. I will upload it again later.

On Fri, Jun 22, 2018 at 7:56 PM Shree Devi Kumar <shreesh...@gmail.com> wrote:

> Please try with iast.traineddata model for tesseract.4.0.0-beta posted at
> https://github.com/Shreeshrii/tessdata_sanskrit
>
> On Thu, Jun 21, 2018 at 11:38 PM yajva <nsvnarasi...@gmail.com> wrote:
>
>> one more correction.
>>
>> On Thursday, June 21, 2018 at 11:34:00 PM UTC+5:30, yajva wrote:
>>>
>>> done
>>>
>>> On Wednesday, June 20, 2018 at 9:05:01 PM UTC+5:30, shree wrote:
>>>>
>>>> I am attaching the OCRed text. Please correct it so that I can use as
>>>> groundtruth for further training and testing.
>>>>
>>>> On Wed, Jun 20, 2018 at 3:15 PM Shree Devi Kumar <shree...@gmail.com>
>>>> wrote:
>>>>
>>>>> I had done a training for sanskrit for both devanagari and IAST but it
>>>>> does not include cedilla for Sh
>>>>>
>>>>> I will add it and let you know.
>>>>>
>>>>> On Wed 20 Jun, 2018, 1:17 AM yajva, <nsvnar...@gmail.com> wrote:
>>>>>
>>>>>> I have tried Google OCR for recognizing Sanskrit text in Roman with
>>>>>> diacritics (IAST). It recognizes above macron but not dots below also
>>>>>> joining grave and accent. Is there any traineddata available for
>>>>>> tesseract that can do this with good accuracy ? Attached a sample
>>>>>> page that I am interested in.
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "tesseract-ocr" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to tesseract-oc...@googlegroups.com.
>>>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/aef0797b-8df3-4db7-9a3b-02f62d2e5a28%40googlegroups.com
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>>>
>>>>
>>>> --
>>>>
>>>> ____________________________________________________________
>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>>
>>>
>>
>