On Sun, May 26, 2013 at 3:36 PM, Renard Wellnitz <[email protected]> wrote:
> Hi,
>
> I did the merge and also updated my repo with build instructions.
>
> Changes:
>
> - Extended ETEXT_DESC with a PROGRESS_FUNC field, so users of the API
>   can register a callback function to get notified of the progress
>   percentage as well as word bounding boxes.
>   - (Most people I have shown my app to really liked how it highlighted
>     the current word when doing the OCR.)
> - Changed the percentage progress values to start at 0% instead of 30%.
> - Added row attributes to the hOCR output so that I can make straighter
>   lines when creating the PDF files.
>
> Cheers
> Renard

I tried to test and apply your patch to the current Tesseract code, and I ran into some questions related to the hOCR standard [1]:

- The standard defines *x_font* *s* for OCR-engine-specific font names. You put font='15' into the hOCR output. What is its purpose?
- The standard defines *x_fsize n* for OCR-engine-specific font size. You used size. Should I change it to x_fsize? (This could be dangerous, because there could be different fonts with different sizes in one line.)
- The standard does not define descenders and ascenders.

Does anyone (from the community) have a suggestion for a better way to stay within the hOCR standard?

[1] https://docs.google.com/document/d/1QQnIQtvdAC_8n92-LhwPcjtAUFwBlzE8EWnKAxlgVf0/preview#

Zdenko

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

