Hi, There is no ready-to-use Sinhalese traineddata file; no one has published usable trained data for Tesseract yet. UCSC has developed a Sinhala OCR <http://ucsc.cmb.ac.lk/ltrl/?page=panl10n_p1&lang=en>, but their traineddata file is not accessible. You can find a Sinhalese traineddata file in this Sinhala OCR project <https://code.google.com/p/sin-ocr/>, but its accuracy is poor. I am planning to train Tesseract for Sinhalese (especially for the lettering in old newspapers, which does not match any exact font). I'll post here if I succeed with the training. If anyone knows how to train Tesseract for Sinhalese with high accuracy, please comment here or share your training files.
Regards.

On Tuesday, February 24, 2015 at 8:13:01 AM UTC+5:30, Thilina Mendis wrote:
> hi guys anyone know whether there is sinhala.traineddata file... for sinhala fonts? i have to use this. please do share if yall have any files regarding sinhala fonts :)
>
> Thanks
> Thilina Mendis

