I'm going to assume you mean "text" here and not "test".

For recognizing text in images, I've used the Tesseract project with pretty 
good success. For more background, search for "OCR" or "optical character 
recognition".
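
Here's a minimal sketch of what that can look like from Python. It assumes 
Tesseract itself is installed and that you use the pytesseract wrapper together 
with Pillow (my choice for the example, not the only way to call Tesseract). 
The tidy() helper and the function names are made up for illustration.

```python
def tidy(raw):
    """Drop blank lines and the form feed that OCR output tends to end with."""
    lines = (line.strip() for line in raw.replace("\f", "").splitlines())
    return "\n".join(line for line in lines if line)


def image_to_text(path, lang="eng"):
    """Run Tesseract OCR on one image file and return the cleaned-up text."""
    # Imported here so the module still loads without the OCR packages.
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        return tidy(pytesseract.image_to_string(img, lang=lang))
```

You would call it as image_to_text("scan.png"), where "scan.png" is a 
placeholder for your own file. Results depend a lot on image quality, so expect 
to experiment with resolution and contrast.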

For parsing text in a PDF, it depends on how the text is encoded in the PDF. 
You can tell by trying to copy and paste the text manually when the PDF is open 
in your PDF reader. If you can't copy and paste the text, it's probably 
embedded as an image inside the PDF. In that case, use the Tesseract method 
recommended above.
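
For the image-only case, the pages have to be turned into images before 
Tesseract can read them. A rough sketch, assuming the pdf2image package (which 
needs poppler installed) for rendering and pytesseract for the OCR step. The 
render and recognize parameters are hypothetical hooks I added so the pieces 
can be swapped out; by default they point at those two libraries.

```python
def ocr_pdf(pdf_path, dpi=300, render=None, recognize=None):
    """OCR every page of a scanned PDF and return a list of page texts."""
    if render is None:
        # pdf2image renders each page to a PIL image (requires poppler).
        from pdf2image import convert_from_path
        render = convert_from_path
    if recognize is None:
        import pytesseract
        recognize = pytesseract.image_to_string

    # One OCR pass per rendered page, in page order.
    return [recognize(page) for page in render(pdf_path, dpi=dpi)]
```

A higher dpi usually helps recognition but makes rendering slower, so 300 is 
only a starting point.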

If you can copy and paste the text, that means it's actual text in the PDF 
file itself. In that case there are a few different PDF parsing libraries in 
Python that you can use to grab the text.
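
As one example, here's a small sketch using pypdf (just one of several 
libraries, picked for illustration). has_text_layer() is a made-up helper that 
does a rough programmatic version of the copy-and-paste check; the 20 character 
threshold is an arbitrary assumption.

```python
def pdf_page_texts(pdf_path):
    """Return the extractable text of each page, one string per page."""
    # Imported here so the helper below works without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


def has_text_layer(page_texts, min_chars=20):
    """Guess whether the PDF holds real text rather than only page images."""
    return sum(len(text.strip()) for text in page_texts) >= min_chars
```

If has_text_layer(pdf_page_texts("document.pdf")) comes back False, the file is 
most likely a scan and OCR is the way to go ("document.pdf" is a placeholder).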

On April 3, 2021 12:54:32 AM CDT, Aniket Gadge <gadgeaniketra...@gmail.com> 
wrote:
>    How to read the test from image and pdf.
>
>-- 
>You received this message because you are subscribed to the Google
>Groups "Django users" group.
>To unsubscribe from this group and stop receiving emails from it, send
>an email to django-users+unsubscr...@googlegroups.com.
>To view this discussion on the web visit
>https://groups.google.com/d/msgid/django-users/CAAAecFt_Tr_4V-NqNY%3D%3DYLML0zjOGD31UOLWezw5DEJPb18Jig%40mail.gmail.com.

