Dear Quan,

What is the status of Unicode support in VietOCR and jTessBoxEditor?
I feel a bit uneasy working with jTessBoxEditor when words like கோ கொ கா
have to be handled.

This is my problem:

கோ cannot under any circumstances be represented as ெ    க    ா , but I think
Tesseract is segmenting the leading ெ  and recognising it as எ . This
accounts for a major part of my problem in jTessBoxEditor as well as in
Tesseract.
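
To make the encoding side of this concrete, here is a minimal Python sketch
using only the standard-library unicodedata module. The helper name and
variable names are my own, and I am assuming the words are stored with the
precomposed vowel signs U+0BCA / U+0BCB. It lists the code points and shows
that a sequence typed in visual (left-to-right glyph) order is a different
string from the logical consonant + vowel sign order, even after
normalisation.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# Logical (storage) order: consonant first, then the dependent vowel sign.
ko_long = "\u0B95\u0BCB"   # KA + VOWEL SIGN OO
ko_short = "\u0B95\u0BCA"  # KA + VOWEL SIGN O
kaa = "\u0B95\u0BBE"       # KA + VOWEL SIGN AA

for word in (ko_long, ko_short, kaa):
    print(word, describe(word))

# Canonical decomposition splits the two-part vowel sign, but the consonant
# still comes first: KA, VOWEL SIGN EE, VOWEL SIGN AA.
decomposed = unicodedata.normalize("NFD", ko_long)
recomposed = unicodedata.normalize("NFC", decomposed)

# Visual order (left part, consonant, right part) is NOT equivalent:
# normalisation does not reorder it into the logical form.
visual = "\u0BC7\u0B95\u0BBE"
visual_nfc = unicodedata.normalize("NFC", visual)

print(describe(decomposed))
print(recomposed == ko_long)   # logical forms are canonically equivalent
print(visual_nfc == ko_long)   # visual-order form is a different string

# The independent vowel that the left-hand part tends to be confused with.
print(describe("\u0B8E"))
```

If the recogniser emits the left-hand part as a separate symbol ahead of the
consonant, the result corresponds to the visual-order string above, which no
Unicode normalisation form will turn back into the intended syllable.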

For English it is a different case, since a single character represents a
letter, and I have only a minimal idea of the Vietnamese language. I am not
sure whether you have faced the same issue and resolved it. Could you please
tell me whether this is a jTessBoxEditor / VietOCR issue, or whether I
should start a new thread?

-Sibi



On Friday, March 6, 2015 at 7:50:14 AM UTC+5:30, Quan Nguyen wrote:
>
>   A Java/.NET GUI frontend for Tesseract OCR engine. The releases include 
> the following improvements:
>
>    - Add Split TIFF function
>    - Add thumbnail bar for ease of page navigation
>    - Display useful info in statusbar
>    - Update links to OpenOffice dictionaries
>    - Add support for reading specific configs files for setting control 
>    parameters
>    - Improved 64-bit support
>
> http://vietocr.sf.net
>
