Re: [tesseract-ocr] bad quality!?

2021-12-29 Thread Cyrus Yip
I played around a bit with replacing all colours except the text colour, and it works pretty well! The only thing is replacing colours with:

    im = im.convert("RGB")
    pixdata = im.load()
    for y in range(im.height):
        for x in range(im.width):
            if pixdata[x, y] != (51, 51, 51):
                pixd
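For readers who want to try the colour-replacement idea, here is a minimal sketch using Pillow. The message is cut off before it shows what the non-text pixels are replaced with, so the white `background` default is an assumption, as is the helper name `keep_only_text_colour`. Only the text colour `(51, 51, 51)` comes from the message.

```python
from PIL import Image


def keep_only_text_colour(im, text_colour=(51, 51, 51), background=(255, 255, 255)):
    """Return an RGB copy of `im` in which every pixel that is not exactly
    `text_colour` is replaced by `background`.

    `background` (white) is an assumed value; the original message is
    truncated before the replacement colour is shown.
    """
    rgb = im.convert("RGB")          # convert() returns a new image
    pixdata = rgb.load()             # pixel access object, indexed as [x, y]
    for y in range(rgb.height):
        for x in range(rgb.width):
            if pixdata[x, y] != text_colour:
                pixdata[x, y] = background
    return rgb
```

An exact match on one RGB value will drop anti-aliased edge pixels, so a tolerance around the text colour may be needed on real screenshots.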

Re: [tesseract-ocr] bad quality!?

2021-12-29 Thread Zdenko Podobny
IMO, if the text is always in the same area, cropping and running OCR on just that area will be faster.

Zdenko

On Wed, 29 Dec 2021 at 18:58, Cyrus Yip wrote:
> I played around a bit and replacing all colours except for text colour and
> it works pretty well!
>
> The only thing is replacing colours with:
>
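The crop-then-OCR suggestion could look roughly like the sketch below. The helper names and the example box are hypothetical. `pytesseract` is one common Python wrapper for Tesseract and is assumed to be installed along with the `tesseract` binary; the thread does not say which binding is in use.

```python
from PIL import Image


def crop_text_area(im, box):
    """Crop `im` to `box`, given as (left, upper, right, lower) in pixels."""
    return im.crop(box)


def ocr_text_area(im, box):
    """Crop first, then run OCR on the cropped region only."""
    # Imported lazily so the cropping helper works without pytesseract.
    import pytesseract  # assumption: installed, with the tesseract binary on PATH
    return pytesseract.image_to_string(crop_text_area(im, box))
```

For example, `ocr_text_area(Image.open("screenshot.png"), (40, 120, 600, 180))` would OCR only that rectangle; both the file name and the coordinates are placeholders.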

Re: [tesseract-ocr] getting started, but no results

2021-12-29 Thread Zdenko Podobny
No. The problem is document layout and text detection. Tesseract is an OCR engine: it can detect text, but only on images with simple layouts (book pages, newspapers, etc.). The input image is too complex, so for a good result you need to send only the text region to the Tesseract API. If the input image is the sa

Re: [tesseract-ocr] bad quality!?

2021-12-29 Thread Cyrus Yip
But won't multiple OCR runs and crops take a lot of time?

On Wednesday, December 29, 2021 at 10:15:26 AM UTC-8 zdenop wrote:
> IMO if the text is always in the same area, cropping and OCR just that
> area will be faster.
>
> Zdenko
>
> On Wed, 29 Dec 2021 at 18:58, Cyrus Yip wrote:
>
>> I played a
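One way to settle the timing question is to measure it on the actual images. The helper below is a small sketch using only the standard library; `time_call` is a hypothetical name, and no timings are claimed here.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call `fn` once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

Wrapping both the full-image OCR call and the crop-plus-OCR call in `time_call` and comparing the elapsed values over a few runs would show which approach is faster for a given layout.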