Re: Read PDF content

William Purcell Thu, 21 Aug 2008 06:49:05 -0700

Sorry, this last email was meant to be to the list.

On Thu, Aug 21, 2008 at 8:41 AM, William Purcell
<[EMAIL PROTECTED]> wrote:


> I have been trying to do the same thing. Here is something I came up with,
> although it's not pure Python. It requires pdftotext to be installed. If
> you're on a Linux box, I think it comes in xpdf-utils, but I'm not
> completely sure. Anyway, install pdftotext and then you could use this
> function:
>
> ----------------------------------------------------------------------------
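> import subprocess
>
> def readpdf(filepath):
>     # Pass the arguments as a list (no shell) so that paths containing
>     # spaces or shell metacharacters work; '-' sends the text to stdout.
>     cmd = ['pdftotext', '-layout', filepath, '-']
>     result = subprocess.run(cmd, capture_output=True, text=True, check=True)
>     # Keep the line endings, as readlines() would.
>     return result.stdout.splitlines(keepends=True)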
>
> ----------------------------------------------------------------------------
> I would like to find something totally Python, but this has worked for me
> in a pinch.
> -Bill
>
>
> On Thu, Aug 21, 2008 at 5:00 AM, AON LAZIO <[EMAIL PROTECTED]> wrote:
>
>> Hi, Guys.
>>       I am trying to extract the content of a PDF file (to get some
>> specific information) using Python. I already tried pyPdf with no success.
>>       Does anyone have suggestions?
>>       Thanks in advance.
>>
>> Aonlazio
>>
>> --
>> http://mail.python.org/mailman/listinfo/python-list
>>
>
>
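
Since the original question was about getting specific information out of the PDF, here is a rough sketch of what could be done with the lines that readpdf returns. The names pdftotext_available and find_lines are made up for illustration, and the sketch assumes Python 3 and that a plain regular expression is enough to spot the lines of interest:

```python
import re
import shutil


def pdftotext_available():
    """Return True if a pdftotext executable can be found on PATH."""
    return shutil.which("pdftotext") is not None


def find_lines(lines, pattern):
    """Return (line_number, stripped_text) pairs for lines matching pattern.

    `lines` is any iterable of strings, e.g. the list returned by readpdf.
    Line numbers start at 1.
    """
    regex = re.compile(pattern)
    return [
        (number, line.strip())
        for number, line in enumerate(lines, start=1)
        if regex.search(line)
    ]
```

Something like find_lines(readpdf('report.pdf'), r'Total') would then list the lines mentioning "Total" (report.pdf is just a placeholder name). Checking pdftotext_available() first gives a clearer error than a failed command when the tool is not installed.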

