Re: [tesseract-ocr] Tessarct changed behaviour inside docker

2022-09-16 Thread vc Jayan
Hi, this is common. Something that works properly on a local machine may not behave as expected once it is dockerized. Please check the following: 1. The Tesseract version you build in the Dockerfile is the same as on your local machine. 2. If the input is a PDF, try changing the version of the pdf2image library in the requirements file. 3.
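
One way to compare the two environments is to print the versions in both places and diff the output. The following is a minimal sketch, not something taken from the thread: the helper names (parse_tesseract_version, tesseract_version, package_version) are hypothetical, and it assumes the tesseract binary is on PATH and that pdf2image was installed with pip.

```python
import re
import shutil
import subprocess
from importlib import metadata


def parse_tesseract_version(output):
    """Return the version token that follows the word 'tesseract', or None."""
    match = re.search(r"tesseract\s+(\S+)", output)
    return match.group(1) if match else None


def tesseract_version():
    """Ask the installed tesseract binary for its version; None if it is missing."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        ["tesseract", "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some builds print the banner on stderr, so both streams are inspected.
    return parse_tesseract_version(result.stdout + result.stderr)


def package_version(name):
    """Return the installed version of a Python package, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("tesseract:", tesseract_version())
    print("pdf2image:", package_version("pdf2image"))
```

Running the same script on the local machine and inside the container gives two short reports that can be compared line by line.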

Re: [tesseract-ocr] Tessarct changed behaviour inside docker

2022-09-16 Thread Gabriel Sousa
Thank you so much for the reply! It really helped me to know which path to take! I have already taken some of those steps, but now I know for a fact that I'm not crazy! hahaha 1 - It's a different version, BUT my coworker is running the same version of the container and gets the same results I

Re: [tesseract-ocr] Tessarct changed behaviour inside docker

2022-09-16 Thread vc Jayan
Hi, there are some image preprocessing steps you can try: 1. Binarization (adjust the threshold and its parameters). 2. Image resizing (OCR works best at a resolution of 300 DPI or higher). 3. Dilation and erosion: adjust the thickness of the text strokes so that OCR can read them better. 4. Adjust the 'oem' and 'psm' parameters.
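
The four steps above can be sketched in plain Python so the idea is visible without extra dependencies. This is an illustrative sketch under stated assumptions, not the poster's code: all function names are hypothetical, the image is a list of rows of 0-255 grayscale values, the "300" in the message is read as DPI, and the defaults threshold=128, oem=3 and psm=6 are arbitrary starting points to tune. In practice an image library would do the pixel work.

```python
def binarize(gray, threshold=128):
    """Map each grayscale pixel (0-255) to 0 or 255 using a fixed threshold."""
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


def _morph(binary, pick):
    """Apply pick (max or min) over each pixel's 3x3 neighbourhood."""
    height, width = len(binary), len(binary[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbours = [
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(pick(neighbours))
        out.append(row)
    return out


def dilate(binary):
    """Grow white (255) regions by one pixel in every direction."""
    return _morph(binary, max)


def erode(binary):
    """Shrink white (255) regions by one pixel in every direction."""
    return _morph(binary, min)


def scale_for_dpi(current_dpi, target_dpi=300):
    """Factor by which to upscale an image to reach the target DPI (never below 1)."""
    return max(1.0, target_dpi / current_dpi)


def tesseract_args(image_path, oem=3, psm=6):
    """Build a tesseract command line that prints the recognised text to stdout."""
    return ["tesseract", image_path, "stdout", "--oem", str(oem), "--psm", str(psm)]
```

Whether dilation or erosion helps depends on whether the text is white on black or black on white after binarization, so it is worth trying both and comparing the OCR output.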