On Sun, May 26, 2013 at 3:36 PM, Renard Wellnitz
<[email protected]> wrote:

> Hi,
>
> i did the merge and also updated my repo with build instructions.
>
> changes:
>
>    - extended ETEXT_DESC with PROGRESS_FUNC field. So users of the api
>    can register a callback function to get notified of progress percentage as
>    well as word bounding boxes.
>       - (Most people i have shown my app really liked how it highlighted
>       the current word when doing the ocr)
>    - changed the percentage progress values to start with 0% instead of
>    30%
>    - added row attributes to hocr output so that i can make more straight
>    lines when creating the pdf files
>
>
> Cheers
> Renard
>


I tried to test and apply your patch against the current tesseract code, and
I have some questions related to the hOCR standard [1]:

   - The standard defines *x_font* *s* for OCR-engine-specific font names.
   You put font='15' into the hOCR output. What is its purpose?
   - The standard defines *x_fsize n* for OCR-engine-specific font size.
   You used size. Should I change it to x_fsize? (That could be dangerous,
   because one line can contain different fonts with different sizes.)
   - The standard does not define descenders and ascenders. Does anyone
   (in the community) have a suggestion for a better way to stay within the
   hOCR standard?
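
To make the first two questions concrete, here is a minimal Python sketch
(not tesseract code) of how font properties might be written using the names
from the standard. The helpers hocr_title and hocr_word are hypothetical, and
attaching x_font / x_fsize to each word rather than to the line is only my
assumption about one way to avoid the mixed-size problem:

```python
from html import escape


def hocr_title(bbox, font_name=None, font_size=None):
    """Build the value of an hOCR title attribute.

    bbox is (x0, y0, x1, y1). font_name and font_size are optional and are
    emitted with the property names x_font and x_fsize.
    Assumption: font_name contains no spaces or semicolons.
    """
    props = ["bbox " + " ".join(str(int(v)) for v in bbox)]
    if font_name is not None:
        props.append("x_font " + font_name)
    if font_size is not None:
        props.append("x_fsize %d" % font_size)
    # hOCR separates properties inside title with semicolons.
    return "; ".join(props)


def hocr_word(text, bbox, font_name=None, font_size=None):
    """Wrap one recognized word in an ocrx_word span (hypothetical helper)."""
    title = hocr_title(bbox, font_name, font_size)
    return "<span class='ocrx_word' title='%s'>%s</span>" % (
        escape(title, quote=True),
        escape(text),
    )


if __name__ == "__main__":
    print(hocr_word("Hello", (10, 20, 110, 45), font_size=15))
```

With the size stored per word, two words on the same line can carry different
x_fsize values, which a single attribute on the line cannot express.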


[1]
https://docs.google.com/document/d/1QQnIQtvdAC_8n92-LhwPcjtAUFwBlzE8EWnKAxlgVf0/preview#

Zdenko




-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
