Hello,

We are working on a project for underprivileged kids, and we need to build 
an OCR system for the Malayalam language.

We downloaded some training data that is available online for Malayalam. 
The current accuracy is around 60%, and we found that a few special 
characters in the language are not recognized properly with this training 
data.

So we wanted to fine-tune the existing training data. After some research, 
we downloaded jTessBoxEditor to create training data, but we couldn't edit 
the incorrect characters with it.

Then we tried the QT-Box editor. We were able to edit the incorrect letters, 
but we couldn't generate the training data through the software.

Finally, we tried generating the custom data from the command line under 
Cygwin, but we failed to combine the training data.

As this is for an NGO, our company wants to close this project at the 
currently achieved 60% accuracy. I really wish to complete it, as the 
English translation is completely wrong. Can someone please guide us on how 
to train the data?

Any help would be much appreciated.
Thanks in advance.
