You can use Layout Parser in Python.

On Fri, 3 Feb 2023 at 11:53 AM, 'Zisha' via tesseract-ocr <tesseract-ocr@googlegroups.com> wrote:
> I want to OCR documents containing images, figures, etc. Is there a way to
> detect non-text items and extract them to png, and then OCR the rest of the
> document?
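
For what it's worth, here is a rough sketch of how that could look. It is not a tested recipe. It assumes the layoutparser package with its Detectron2 backend, the PubLayNet model from Layout Parser's model zoo, plus Pillow, numpy and pytesseract. The function names, the "regions" output folder and the 0.8 score threshold are made up for illustration. Figures and tables are saved as PNG files, painted white on a copy of the page, and the masked page is then passed to Tesseract.

```python
from pathlib import Path

# Label map of the PubLayNet model (assumption: other models use other labels).
LABEL_MAP = {0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"}
# Which labels count as "non-text" is a choice; adjust as needed.
NON_TEXT = {"Figure", "Table"}


def split_regions(regions, non_text=NON_TEXT):
    """Split (label, (x1, y1, x2, y2)) pairs into non-text and text boxes.

    Coordinates are rounded to whole pixels so they can be used for cropping.
    """
    non_text_boxes, text_boxes = [], []
    for label, box in regions:
        box = tuple(int(round(v)) for v in box)
        if label in non_text:
            non_text_boxes.append(box)
        else:
            text_boxes.append(box)
    return non_text_boxes, text_boxes


def extract_and_ocr(image_path, out_dir="regions"):
    """Save non-text regions as PNG files and OCR the rest of the page."""
    # Third-party imports stay local so split_regions works without them.
    import layoutparser as lp
    import numpy as np
    import pytesseract
    from PIL import Image, ImageDraw

    page = Image.open(image_path).convert("RGB")

    # Detect layout blocks; the model weights are downloaded on first use.
    model = lp.Detectron2LayoutModel(
        "lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config",
        extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
        label_map=LABEL_MAP,
    )
    layout = model.detect(np.asarray(page))
    non_text_boxes, _ = split_regions((b.type, b.coordinates) for b in layout)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Crop each non-text region to its own PNG, then blank it on a copy.
    masked = page.copy()
    draw = ImageDraw.Draw(masked)
    for i, box in enumerate(non_text_boxes):
        page.crop(box).save(out / f"region_{i}.png")
        draw.rectangle(box, fill="white")

    # OCR only what is left after masking.
    return pytesseract.image_to_string(masked)
```

Calling extract_and_ocr("page.png") should return the recognized text and leave the cropped regions in the output folder. Masking keeps the page geometry intact for Tesseract's own page segmentation; running OCR on each detected text block separately would be an alternative if the reading order matters.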