Hi All,

I am trying to upgrade the software versions of an in-house text extraction 
application built with Python, the tesserocr Python module, and the 
Tesseract OCR engine, as below:



   - Existing versions (outdated): Python (v3.6.5) + tesserocr (v2.4.0) + 
   Tesseract OCR (v4)
   - Target versions (latest):     Python (v3.10.7) + tesserocr (v2.5.2) + 
   Tesseract OCR (v5)


However, the two setups above give different results when I call the 
GetHOCRText method on the same input: the bounding box coordinates, the 
extracted text (minor changes), and other numerical metadata all differ.
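
To make the differences concrete, here is a minimal sketch of how the word 
bounding boxes in two hOCR strings could be compared. It assumes the hOCR 
output of each setup has already been saved as a string, and that words are 
marked up as spans with class ocrx_word and a "bbox x0 y0 x1 y1" entry in 
the title attribute. The helper names (WordBoxParser, word_boxes, 
bbox_diffs) are hypothetical and not part of tesserocr.

```python
import re
from html.parser import HTMLParser

# Assumption: each word's box appears in the title attribute as "bbox x0 y0 x1 y1".
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class WordBoxParser(HTMLParser):
    """Collects {word id: (x0, y0, x1, y1)} from ocrx_word elements."""

    def __init__(self):
        super().__init__()
        self.boxes = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("class") != "ocrx_word":
            return
        match = BBOX_RE.search(attrs.get("title") or "")
        if match and attrs.get("id"):
            self.boxes[attrs["id"]] = tuple(int(v) for v in match.groups())


def word_boxes(hocr):
    """Parse an hOCR string and return its word bounding boxes keyed by id."""
    parser = WordBoxParser()
    parser.feed(hocr)
    return parser.boxes


def bbox_diffs(old_hocr, new_hocr):
    """Return {word id: per-coordinate delta} for words whose boxes differ.

    Only ids present in both outputs are compared; matching by id is a
    simplification and breaks down if the two runs segment words differently.
    """
    old, new = word_boxes(old_hocr), word_boxes(new_hocr)
    diffs = {}
    for word_id in old.keys() & new.keys():
        delta = tuple(n - o for o, n in zip(old[word_id], new[word_id]))
        if any(delta):
            diffs[word_id] = delta
    return diffs
```

Calling bbox_diffs(old_hocr, new_hocr) on the two saved outputs lists only 
the words whose coordinates moved, with the shift for each coordinate, which 
shows whether the changes are off-by-a-pixel shifts or larger layout 
differences.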

I need the upgraded software to produce exactly the same metadata (e.g. 
bounding boxes) as the old versions, because downstream steps that run 
after the text extraction depend on it.

Could you please advise?

Regards,
Prashant Sharma
