Dear Quan,

What is the status of Unicode support in VietOCR and jTessBoxEditor? I feel a bit uneasy working with jTessBoxEditor when words like கோ, கொ, கா have to be handled.
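To make the example concrete, here is a minimal Python sketch (an illustration only, not code from Tesseract, VietOCR or jTessBoxEditor) of how these three words are stored in Unicode. It uses only the standard `unicodedata` module; the helper name `describe` is made up for this example. In each word the consonant க (U+0B95) is stored first and the vowel sign follows it, even though part of the sign is drawn to the left of the consonant.

```python
import unicodedata

# The three example words, written with escapes so the stored order is explicit.
WORDS = {
    "கோ": "\u0B95\u0BCB",  # KA + VOWEL SIGN OO
    "கொ": "\u0B95\u0BCA",  # KA + VOWEL SIGN O
    "கா": "\u0B95\u0BBE",  # KA + VOWEL SIGN AA
}


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for shown, stored in WORDS.items():
    # NFD splits the two-part vowel signs into their left and right pieces,
    # but the consonant still comes first in the stored sequence.
    decomposed = unicodedata.normalize("NFD", stored)
    print(shown, "stored:    ", describe(stored))
    print(shown, "decomposed:", describe(decomposed))
```

Even in decomposed form, கோ is U+0B95 U+0BC7 U+0BBE: the consonant leads, and the independent vowel எ (U+0B8E) never appears in it.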
This is my problem: கோ cannot under any circumstances be represented as ெ க ா, but I think Tesseract is segmenting the leading ெ separately and recognising it as எ. Cases like this make up a major part of my problems with jTessBoxEditor as well as with Tesseract. English is a different case, since a single character represents a letter, and I have only a minimal idea of how Vietnamese works. I am not sure whether you are facing, or have faced, the same issue and resolved it. Could you please let me know whether this is a jTessBoxEditor / VietOCR issue, or whether I should start a new thread?

-Sibi

On Friday, March 6, 2015 at 7:50:14 AM UTC+5:30, Quan Nguyen wrote:
> A Java/.NET GUI frontend for Tesseract OCR engine. The releases include
> the following improvements:
>
> - Add Split TIFF function
> - Add thumbnail bar for ease of page navigation
> - Display useful info in statusbar
> - Update links to OpenOffice dictionaries
> - Add support for reading specific configs files for setting control
>   parameters
> - Improved 64-bit support
>
> http://vietocr.sf.net

