Please try the iast.traineddata model for Tesseract 4.0.0-beta, posted at https://github.com/Shreeshrii/tessdata_sanskrit
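
In case it helps, here is a rough sketch of how the model could be tried from Python by calling the tesseract command line. It assumes tesseract 4.x is on your PATH and that iast.traineddata has been downloaded into a local directory. The function names, the image name page.png and the directory name tessdata_sanskrit are placeholders of mine, not anything defined in the repository.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image, output_base, tessdata_dir, lang="iast"):
    """Build the argument list for one tesseract run (hypothetical helper).

    --tessdata-dir points at the directory holding <lang>.traineddata,
    and -l selects the model by name.
    """
    return [
        "tesseract",
        str(image),
        str(output_base),
        "--tessdata-dir", str(tessdata_dir),
        "-l", lang,
    ]


def run_ocr(image, output_base, tessdata_dir, lang="iast"):
    """Run tesseract and return the path of the text file it writes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    model = Path(tessdata_dir) / f"{lang}.traineddata"
    if not model.is_file():
        raise FileNotFoundError(f"missing model file: {model}")
    subprocess.run(
        build_tesseract_command(image, output_base, tessdata_dir, lang),
        check=True,
    )
    # tesseract appends .txt to the output base by default
    return Path(f"{output_base}.txt")


# Example call (placeholder names, adjust to your own files):
# text_file = run_ocr("page.png", "page", "tessdata_sanskrit")
# print(text_file.read_text(encoding="utf-8"))
```

The command is built in a separate function so that it can be printed and checked before anything is run.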

On Thu, Jun 21, 2018 at 11:38 PM yajva <nsvnarasi...@gmail.com> wrote:

> one more correction.
>
> On Thursday, June 21, 2018 at 11:34:00 PM UTC+5:30, yajva wrote:
>>
>> done
>>
>> On Wednesday, June 20, 2018 at 9:05:01 PM UTC+5:30, shree wrote:
>>>
>>> I am attaching the OCRed text. Please correct it so that I can use as
>>> groundtruth for further training and testing.
>>>
>>> On Wed, Jun 20, 2018 at 3:15 PM Shree Devi Kumar <shree...@gmail.com>
>>> wrote:
>>>
>>>> I had done a training for sanskrit for both devanagari and IAST but it
>>>> does not include cedilla for Sh
>>>>
>>>> I will add it and let you know.
>>>>
>>>> On Wed 20 Jun, 2018, 1:17 AM yajva, <nsvnar...@gmail.com> wrote:
>>>>
>>>>> I have tried Google OCR for recognizing Sanskrit text in Roman with
>>>>> diacritics (IAST). It recognizes above macron but not dots below also
>>>>> joining grave and accent. Is there any traineddata available for tesseract
>>>>> that can do this with good accuracy ? Attached a sample page that I am
>>>>> interested in.
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to tesseract-oc...@googlegroups.com.
>>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/aef0797b-8df3-4db7-9a3b-02f62d2e5a28%40googlegroups.com
>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/aef0797b-8df3-4db7-9a3b-02f62d2e5a28%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>> For more options, visit https://groups.google.com/d/optout.