Improving recognition

nico Sat, 01 Dec 2012 17:37:24 -0800

Hello,

I am testing tesseract on an image of a supermarket bill. The bill is in 
French with some article names in English. The command used is


tesseract.exe bill.jpg bjfra -l fra -psm 6

with the input in jpg format. The result is half gibberish. (Link to input
& output files:
https://docs.google.com/folder/d/0B_GVrSvgtwQrU0F6a3lxUzJUc0U/edit)
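
For anyone who wants to reproduce the run from a script, the invocation quoted above could be wrapped roughly as follows. This is only a sketch: the helper names build_tesseract_command and run_ocr are hypothetical, it assumes the tesseract executable is on the PATH, and only the -l and -psm flags are taken from the command in this message.

```python
import subprocess


def build_tesseract_command(image, output_base, lang="fra", psm=6, exe="tesseract"):
    """Build the argument list for the command quoted in this message.

    Hypothetical helper: only the -l and -psm flags come from the
    original command; `exe` assumes tesseract is reachable on the PATH.
    """
    return [exe, image, output_base, "-l", lang, "-psm", str(psm)]


def run_ocr(image, output_base, **kwargs):
    """Run tesseract and return the name of the text file it writes."""
    cmd = build_tesseract_command(image, output_base, **kwargs)
    # check=True raises CalledProcessError if tesseract exits with an error.
    subprocess.run(cmd, check=True)
    # tesseract appends ".txt" to the output base name.
    return output_base + ".txt"


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # even on a machine without tesseract installed.
    print(" ".join(build_tesseract_command("bill.jpg", "bjfra")))
```

Keeping the command construction separate from the call makes it easy to try other values of the page segmentation mode or language on the same image and compare the outputs.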


I was wondering if there are some specific ways to improve the quality of 
the output.

Is there a visual tool that can be used to help/train tesseract in the 
recognition process? I tried segdemo but no window showed up.

C:\>"\Program Files (x86)\Tesseract-OCR\tesseract.exe" bill.jpg eng segdemo inter
Tesseract Open Source OCR Engine v3.02 with Leptonica
Starting java -Xms512m -Xmx1024m -Djava.library.path="C:\Program Files 
(x86)\Tesseract-OCR\java" -cp "C:\Program Files 
(x86)\Tesseract-OCR\java"/ScrollView.jar;"C:\Program Files 
(x86)\Tesseract-OCR\java"/piccolo-1.2.jar;"C:\Program Files 
(x86)\Tesseract-OCR\java"/piccolox-1.2.jar com.google.scrollview.ScrollView

Would the experts have any suggestions?

Thanks

