Re: Help with a Python coding question

Justin Peel Wed, 05 Jan 2011 17:17:52 -0800

On Wed, Jan 5, 2011 at 4:45 PM, Emile van Sebille <[email protected]> wrote:


> On 1/5/2011 3:12 PM [email protected] said...
>
>> I want to use Python to find all "\n" terminated
>> strings in a PDF file, ideally returning string
>> starting addresses. Anyone willing to help?
>>
>
> pdflines = open(r'c:\shared\python_book_01.pdf').readlines()
> sps = [0]
> for ii in pdflines: sps.append(sps[-1]+len(ii))
>
> Emile
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>
Bear in mind that PDF files often contain compressed objects. If that
is the case, I would recommend opening the PDF in binary mode and
working out how to decompress (inflate) the relevant stream objects before
doing any searching. PyPDF is a package that might help with this, though
it could use some updating.
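
A minimal sketch of the binary-mode approach, using only the standard
library. The function names line_offsets and inflated_streams are made up
for illustration, and the regular expression is a naive assumption about
how stream objects are laid out; it ignores the /Length entry and any
filter other than Flate, so treat it as a starting point rather than a
real PDF parser.

```python
import re
import zlib


def line_offsets(data):
    """Return the byte offset at which each "\\n"-terminated line starts."""
    offsets = []
    start = 0
    while True:
        end = data.find(b"\n", start)
        if end == -1:
            break  # trailing bytes without a "\n" are not counted
        offsets.append(start)
        start = end + 1
    return offsets


# Naive assumption: a stream body sits between "stream" + EOL and "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflated_streams(data):
    """Yield (body_offset, decompressed_bytes) for streams zlib can inflate."""
    for match in STREAM_RE.finditer(data):
        try:
            # decompressobj tolerates the EOL bytes left before "endstream"
            yield match.start(1), zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate-compressed, or uses another filter
```

Read the file with open(path, 'rb').read() and pass the resulting bytes
to both functions. line_offsets gives addresses in the raw file, while the
offsets from line_offsets(text) on each inflated stream are relative to
the start of that decompressed stream, not to the file.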

-- 
http://mail.python.org/mailman/listinfo/python-list
