On Sep 25, 3:02 pm, Paul Hankin <[EMAIL PROTECTED]> wrote:
> Googling for 'pdf to text python' and following the first link
> gives http://pybrary.net/pyPdf/

It doesn't work that well. I've tried it, and you should too. The
author even admits as much in the documentation:

extractText()

    Locate all text drawing commands, in the order they are provided
in the content stream, and extract the text. This works well for some
PDF files, but poorly for others, depending on the generator used.
This will be refined in the future. Do not rely on the order of text
coming out of this function, as it will change if this function is
made more sophisticated.

Source: http://pybrary.net/pyPdf/pythondoc-pyPdf.pdf.html
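
For anyone who wants to try it themselves, here is a minimal sketch of
calling extractText() on every page of a file. It assumes the legacy
pyPdf API (PdfFileReader, getNumPages, getPage). The helper names
extract_text_per_page and read_pdf_text are made up for illustration.
As the documentation warns, the quality and ordering of the text will
depend on the tool that generated the PDF.

```python
def extract_text_per_page(reader):
    """Return the extracted text of each page as a list, in page order.

    `reader` is assumed to behave like a legacy pyPdf PdfFileReader,
    i.e. it provides getNumPages() and getPage(index).
    """
    texts = []
    for index in range(reader.getNumPages()):
        page = reader.getPage(index)
        # extractText() returns text in content-stream order, which is
        # not guaranteed to match the visual reading order.
        texts.append(page.extractText())
    return texts


def read_pdf_text(path):
    """Open the PDF at `path` (a hypothetical file) and extract its text."""
    # Imported here so the helper above works without pyPdf installed.
    from pyPdf import PdfFileReader

    with open(path, "rb") as stream:
        reader = PdfFileReader(stream)
        # Extract while the file is still open; pages are read lazily.
        return extract_text_per_page(reader)
```

Compare the result against the original document page by page before
relying on it, since some files come out nearly clean and others come
out jumbled.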

-- 
http://mail.python.org/mailman/listinfo/python-list