On Thu, 26 Mar 2009 18:31:31 -0300, M Kumar <tomanis...@gmail.com>
wrote:
> I need to read PDF files and extract data from them. Is there any way to
> do it through Python?
If you are interested in the text, I'd use Ghostscript pdf2text (you may
invoke it from inside Python).
Actually, extracting text from a PDF is rather difficult. It's a
"presentation" (or "display") format: every word in the document
might be absolutely positioned, so there is no paragraph structure you can
rely on.
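Invoking an external converter from Python could look roughly like the
sketch below. It is only an illustration under some assumptions: it uses
the pdftotext command-line tool (from Xpdf/Poppler) rather than
Ghostscript, assumes that tool is installed and on your PATH, and the
helper names build_command and extract_text are made up for this example.
The -layout option asks pdftotext to keep the physical layout of the page,
which sometimes helps with the positioning problem and sometimes doesn't.

```python
import shutil
import subprocess
import sys


def build_command(pdf_path, layout=False, tool="pdftotext"):
    """Build the argument list for the converter (hypothetical helper).

    Assumes a pdftotext-compatible tool: "-" as the output file name
    means "write the text to standard output".
    """
    cmd = [tool, "-enc", "UTF-8"]
    if layout:
        # Try to preserve the physical layout of the page.
        cmd.append("-layout")
    cmd += [pdf_path, "-"]
    return cmd


def extract_text(pdf_path, layout=False):
    """Run the external tool and return the extracted text as a str."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on PATH")
    result = subprocess.run(
        build_command(pdf_path, layout),
        capture_output=True,
        check=True,  # raise CalledProcessError if the tool fails
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__" and len(sys.argv) > 1:
    print(extract_text(sys.argv[1]))
```

Whatever comes back is just a stream of text; you will probably still
have to clean it up (line breaks, columns, headers and footers) before
you can pull real data out of it.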
--
Gabriel Genellina