Hello,

I am testing Tesseract on an image of a supermarket bill. The bill is in French, with some article names in English. The command I used, with the input in JPG format, is:

    tesseract.exe bill.jpg bjfra -l fra -psm 6

About half of the result is gibberish. (Link to input and output files: https://docs.google.com/folder/d/0B_GVrSvgtwQrU0F6a3lxUzJUc0U/edit)

Are there specific ways to improve the quality of the output? Is there a visual tool that can be used to help or train Tesseract in the recognition process? I tried segdemo, but no window showed up:

    C:\>"\Program Files (x86)\Tesseract-OCR\tesseract.exe" bill.jpg eng segdemo inter
    Tesseract Open Source OCR Engine v3.02 with Leptonica
    Starting java -Xms512m -Xmx1024m -Djava.library.path="C:\Program Files (x86)\Tesseract-OCR\java" -cp "C:\Program Files (x86)\Tesseract-OCR\java"/ScrollView.jar;"C:\Program Files (x86)\Tesseract-OCR\java"/piccolo-1.2.jar;"C:\Program Files (x86)\Tesseract-OCR\java"/piccolox-1.2.jar com.google.scrollview.ScrollView

Would the experts have any suggestions?

Thanks

