[tesseract-ocr] How to extract non-text regions

2023-02-02 Thread 'Zisha' via tesseract-ocr
I want to OCR documents containing images, figures, etc. Is there a way to detect non-text items and extract them to png, and then OCR the rest of the document?

Re: [tesseract-ocr] How to extract non-text regions

2023-02-02 Thread Muneeb Khurram
You can use Layout Parser in Python.
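
A minimal sketch of how that suggestion might look in practice, not a tested recipe. It assumes the layoutparser package with its Detectron2 backend, Pillow, and pytesseract are installed, and it uses the PubLayNet model from the Layout Parser model zoo as one possible choice. The helper names (split_regions, detect_regions, extract_and_ocr), the 0.8 score threshold, and the decision to treat tables as non-text are illustrative assumptions.

```python
from typing import Iterable, List, Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels
Region = Tuple[str, Box]          # (label, box)

# Assumption: PubLayNet labels; everything outside this set counts as non-text.
TEXT_LABELS = {"Text", "Title", "List"}


def split_regions(regions: Iterable[Region]) -> Tuple[List[Region], List[Region]]:
    """Partition detected regions into (text, non_text) by label."""
    text: List[Region] = []
    non_text: List[Region] = []
    for label, box in regions:
        (text if label in TEXT_LABELS else non_text).append((label, box))
    return text, non_text


def detect_regions(image) -> List[Region]:
    """Run a Layout Parser model on a PIL image (hypothetical helper)."""
    import layoutparser as lp  # imported lazily; needs the Detectron2 backend

    model = lp.Detectron2LayoutModel(
        "lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config",
        extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
        label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},
    )
    return [
        (block.type, tuple(int(round(v)) for v in block.coordinates))
        for block in model.detect(image)
    ]


def extract_and_ocr(path: str, out_prefix: str = "region") -> str:
    """Save each non-text region as a PNG, blank it out, then OCR the page."""
    from PIL import Image, ImageDraw
    import pytesseract

    image = Image.open(path).convert("RGB")
    _, non_text = split_regions(detect_regions(image))

    page = image.copy()
    draw = ImageDraw.Draw(page)
    for i, (label, box) in enumerate(non_text):
        image.crop(box).save(f"{out_prefix}_{i}_{label.lower()}.png")
        draw.rectangle(box, fill="white")  # hide the region from Tesseract

    return pytesseract.image_to_string(page)
```

Calling extract_and_ocr("page.png") would write files such as region_0_figure.png next to the script and return the OCR text of the remaining page. If tables should be OCRed rather than cropped out, adding "Table" to TEXT_LABELS is one way to do that.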