Re: [tesseract-ocr] extract paragraphs from the scanned pdf

Zdenko Podobny Sat, 13 Mar 2021 03:45:59 -0800

If you need help, please provide an example input document, what you have
already tried (including your code), the expected output, etc.
Otherwise forum users will just treat your post as a statement and
nobody will care.


Zdenko


On Sat, 13 Mar 2021 at 9:50, Ajeet Ojha <[email protected]> wrote:

> Hi all, I need to extract paragraphs from a scanned PDF. Can you please
> advise on how to do it using pytesseract? The images are very different and
> have various formats, and typically each page (so each image) has 3-10
> paragraphs. My goal is to somehow draw the bounding boxes and then
> extract the text from each paragraph.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/ea047a7b-477d-492d-9813-f0ec56c0c065n%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/ea047a7b-477d-492d-9813-f0ec56c0c065n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
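
As a starting point for the goal described in the quoted question, here is a
minimal sketch of one possible approach, not a tested answer for these
particular documents. It assumes pytesseract and Pillow are installed and that
each PDF page has already been rendered to an image file. It relies on the
block_num and par_num columns that pytesseract.image_to_data returns for every
word; the function names group_paragraphs and paragraphs_from_image are
hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def group_paragraphs(data):
    """Group word rows from pytesseract.image_to_data (Output.DICT) by paragraph.

    Returns a list of dicts with a (left, top, right, bottom) box and the
    joined text, ordered by Tesseract's page, block and paragraph numbers.
    """
    groups = defaultdict(list)
    for i, text in enumerate(data["text"]):
        if not text.strip():
            continue  # structural rows (page/block/line) carry no text
        key = (data["page_num"][i], data["block_num"][i], data["par_num"][i])
        groups[key].append(i)

    paragraphs = []
    for key in sorted(groups):
        idx = groups[key]
        # The paragraph box is the union of its word boxes.
        left = min(data["left"][i] for i in idx)
        top = min(data["top"][i] for i in idx)
        right = max(data["left"][i] + data["width"][i] for i in idx)
        bottom = max(data["top"][i] + data["height"][i] for i in idx)
        paragraphs.append({
            "box": (left, top, right, bottom),
            "text": " ".join(data["text"][i] for i in idx),
        })
    return paragraphs


def paragraphs_from_image(path):
    """Run Tesseract on one page image and return its paragraphs."""
    # Imported here so group_paragraphs can be used without Tesseract installed.
    import pytesseract
    from PIL import Image

    data = pytesseract.image_to_data(
        Image.open(path), output_type=pytesseract.Output.DICT
    )
    return group_paragraphs(data)
```

Each returned box can then be drawn on the page image or used to crop it. How
well Tesseract's own paragraph segmentation matches the visual paragraphs will
depend on the scans, so results on pages with varied layouts should be checked
by hand.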

