The task you mention is called "document layout segmentation" or "document layout analysis" (https://en.wikipedia.org/wiki/Document_layout_analysis).
As Muneeb mentioned, you can try LayoutParser (https://layout-parser.github.io/); eynollah (https://github.com/qurator-spk/eynollah) also looks promising.

If you would like to do custom training, have a look at https://towardsdatascience.com/object-detection-on-newspaper-images-using-yolov3-85acfa563080

More code and tools can be found via GitHub topics: https://github.com/topics/document-layout-analysis

Zdenko

On Fri, Feb 3, 2023 at 7:53, 'Zisha' via tesseract-ocr <tesseract-ocr@googlegroups.com> wrote:
> I want to OCR documents containing images, figures, etc. Is there a way to
> detect non-text items and extract them to png, and then OCR the rest of the
> document?
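
For the "extract non-text items to png, then OCR the rest" part, here is a rough sketch of what could happen once a layout detector has given you labelled boxes. It is not taken from any of the tools above: the Region class, the label names (PubLayNet-style "Text", "Title", "List", "Figure", "Table") and both function names are my own assumptions for illustration. It assumes Pillow and pytesseract are installed, and that you convert your detector's output into Region objects yourself.

```python
from dataclasses import dataclass
from pathlib import Path

# Labels treated as text; every other label is cropped out and masked.
# These names are an assumption (PubLayNet-style), adjust to your detector.
TEXT_LABELS = {"Text", "Title", "List"}


@dataclass(frozen=True)
class Region:
    """Hypothetical container for one detected layout block."""
    label: str
    box: tuple  # (left, top, right, bottom) in pixels


def split_regions(regions):
    """Return (text_regions, non_text_regions), preserving input order."""
    text = [r for r in regions if r.label in TEXT_LABELS]
    other = [r for r in regions if r.label not in TEXT_LABELS]
    return text, other


def extract_and_ocr(image_path, regions, out_dir):
    """Save non-text regions as PNG files, blank them out, OCR the remainder.

    Returns (ocr_text, list_of_saved_png_paths).
    """
    # Imported here so split_regions() stays usable without these packages.
    from PIL import Image, ImageDraw
    import pytesseract

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    page = Image.open(image_path).convert("RGB")
    draw = ImageDraw.Draw(page)
    _, non_text = split_regions(regions)

    saved = []
    for i, region in enumerate(non_text):
        box = tuple(int(v) for v in region.box)
        # Save the crop first, then paint the same area white on the page.
        target = out_dir / f"{region.label.lower()}_{i}.png"
        page.crop(box).save(target)
        draw.rectangle(box, fill="white")
        saved.append(target)

    # Only text should be left on the page at this point.
    text = pytesseract.image_to_string(page)
    return text, saved
```

You would call it as extract_and_ocr("page.png", regions, "out"), where "page.png" and "out" are placeholder names. Painting the figures white before OCR is just one simple option; you could instead OCR each text box separately.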