On Wednesday, May 14, 2014 1:17:46 PM UTC-5, Joe Aspara wrote:
>
> @Quan
> OK, so with psm 0 OCR isn't performed, but I expected that when I run
> "tesseract input_image output_text -l eng -psm 0" I would get the analysis
> result in the output_text file. With Tesseract 3.02 that isn't the case :(
>
I was referring to Tess4J. You can get the information of interest through the API.

>
> @zdenop
> So Tesseract v. 3.02 doesn't support this feature... I'll try version
> 3.03! Many thanks!
>
> On Sunday, May 11, 2014 1:53:45 PM UTC+2, Quan Nguyen wrote:
>>
>> With psm 0, Tesseract does not perform normal OCR but analyzes the
>> layout; it produces characteristics such as Orientation, Writing Direction,
>> and Textline Order. Check the Tess4J unit tests for usage of OSD.
>>
>> On Sunday, May 11, 2014 5:48:39 AM UTC-5, Joe Aspara wrote:
>>>
>>> I'm struggling with the OSD function of Tesseract 3.02.
>>> I tried the standalone version via the command line and the Tess4J version
>>> too, but I always get an error with different input types.
>>>
>>> I downloaded the osd.traineddata for version 3.01 (I guess no such file
>>> exists yet for v3.02) from here
>>> https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-3.01.osd.tar.gz&can=2&q=
>>> and I copied it properly into the TESSDATA folder.
>>>
>>> Below are my experiments:
>>>
>>> COMMAND LINE
>>> tesseract input_image output_text -l eng -psm 0
>>> response: Error during processing.
>>>
>>> With psm = 1 it reads text with very bad quality; with psm = 2 or 3 it
>>> gives me empty output.
>>>
>>> As far as I know, only values 0 and 1 perform OSD! From the reference:
>>> 0 = Orientation and script detection (OSD) only.
>>> 1 = Automatic page segmentation with OSD.
>>>
>>>
>>> TESS4J
>>> Tesseract instance = Tesseract.getInstance();
>>> instance.setLanguage("ita");
>>> instance.setPageSegMode(TessPageSegMode.PSM_AUTO_OSD);
>>> String result = instance.doOCR(myImage);
>>>
>>> result is always empty at the end.
>>>
>>> Knowing the input orientation is critical for my project, but for now
>>> I'm not able to find a way to accomplish this.
>>>
>>> I hope somebody can help me!
>>> Thanks in advance
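For anyone repeating the command-line experiment from Java, here is a minimal sketch. The class name `OsdCommand` and the helpers `build` and `run` are hypothetical names chosen for illustration; only the `tesseract input_image output_text -l eng -psm 0` invocation itself comes from this thread. `run` merges stderr into stdout so that nothing the tool prints is lost. It makes no assumption about whether a given Tesseract build reports the OSD result on the console or in the output file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tesseract or Tess4J.
class OsdCommand {

    // Builds the argument list for an OSD-only run (psm 0),
    // mirroring the command quoted in the thread.
    static List<String> build(String inputImage, String outputBase, String language) {
        List<String> cmd = new ArrayList<>();
        cmd.add("tesseract");
        cmd.add(inputImage);
        cmd.add(outputBase);
        cmd.add("-l");
        cmd.add(language);
        cmd.add("-psm");
        cmd.add("0");
        return cmd;
    }

    // Runs the command and returns everything it printed (stdout and stderr
    // merged), followed by a line with the exit code.
    // Requires the tesseract executable to be on the PATH.
    static String run(List<String> cmd) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true);
        Process process = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        int exitCode = process.waitFor();
        out.append("exit code: ").append(exitCode).append('\n');
        return out.toString();
    }

    public static void main(String[] args) {
        // Only prints the command; call run(cmd) where tesseract is installed.
        List<String> cmd = build("input_image", "output_text", "eng");
        System.out.println(String.join(" ", cmd));
    }
}
```

Calling `OsdCommand.run(OsdCommand.build("input_image", "output_text", "eng"))` on a machine with Tesseract installed returns the console output together with the exit code, which makes a message such as "Error during processing." easier to capture and report.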

