Re: searching pdf files for certain info

2005-02-24 Thread Follower
rbt <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>... > Not really a Python question... but here goes: Is there a way to read > the content of a PDF file and decode it with Python? I'd like to read > PDF's, decode them, and then search the data for certain strings. I've had succes

Re: searching pdf files for certain info

2005-02-22 Thread Tom Willis
Ah, that makes sense. I only see the behavior in pdftotext. ps2ascii doesn't give me the layout, which, for my purposes, I certainly need. Thanks for the info. Looks like I'll keep searching for that silver bullet. :( On Tue, 22 Feb 2005 20:07:50 -0500, rbt <[EMAIL PROTECTED]> wrote: > Tom Willis

Re: searching pdf files for certain info

2005-02-22 Thread rbt
Tom Willis wrote: Well sporadic spaces in strings would cause problems would it not? an example The String: "Patient Face Sheet"--->pdftotext--->"P a tie n t Face Sheet" I'm just curious if you see anything like that, since I really have no clue about ps or pdf etc...but I have a strong desire

Re: searching pdf files for certain info

2005-02-22 Thread Tom Willis
Well sporadic spaces in strings would cause problems would it not? an example The String: "Patient Face Sheet"--->pdftotext--->"P a tie n t Face Sheet" I'm just curious if you see anything like that, since I really have no clue about ps or pdf etc...but I have a strong desire to replace a r
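
One possible workaround for the stray-space problem described above is to search with a pattern that tolerates whitespace between any two characters. This is only a sketch: the helper names `spaced_pattern` and `contains_phrase` are made up for illustration, and the approach assumes the extractor only inserts extra whitespace and does not drop or reorder characters.

```python
import re


def spaced_pattern(phrase):
    """Build a regex that matches `phrase` even if whitespace was
    inserted (or removed) between any of its characters."""
    # Drop the phrase's own whitespace, then allow optional whitespace
    # between every remaining character.
    chars = [re.escape(c) for c in phrase if not c.isspace()]
    if not chars:
        raise ValueError("phrase must contain a non-whitespace character")
    return re.compile(r"\s*".join(chars), re.IGNORECASE)


def contains_phrase(text, phrase):
    """Return True if `phrase` occurs in `text`, ignoring whitespace."""
    return spaced_pattern(phrase).search(text) is not None


if __name__ == "__main__":
    garbled = "P a tie n t Face Sheet"
    print(contains_phrase(garbled, "Patient Face Sheet"))
```

Because whitespace is ignored entirely, the pattern also matches text where the words run together, so short search phrases may produce false positives.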

Re: searching pdf files for certain info

2005-02-22 Thread Kartic
rbt said the following on 2/22/2005 8:53 AM: Not really a Python question... but here goes: Is there a way to read the content of a PDF file and decode it with Python? I'd like to read PDF's, decode them, and then search the data for certain strings. Thanks, rbt Hi, Try pdftotext which is part o
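
A rough sketch of how the pdftotext suggestion could be driven from Python follows. It assumes the `pdftotext` command-line tool is installed and on the PATH, and that it accepts the `-layout` and `-enc` options and `-` as the output file (meaning standard output). The function names `pdf_to_text` and `find_strings` are hypothetical, not part of any library.

```python
import shutil
import subprocess


def pdf_to_text(pdf_path, keep_layout=True):
    """Return the text of a PDF by running the external pdftotext tool."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on PATH")
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if keep_layout:
        cmd.append("-layout")  # try to preserve the physical page layout
    cmd += [pdf_path, "-"]     # "-" asks pdftotext to write to stdout
    result = subprocess.run(
        cmd, capture_output=True, check=True,
        encoding="utf-8", errors="replace",
    )
    return result.stdout


def find_strings(text, needles):
    """Map each needle to the 1-based line numbers where it occurs
    (case-insensitive substring match)."""
    hits = {needle: [] for needle in needles}
    for lineno, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for needle in needles:
            if needle.lower() in lowered:
                hits[needle].append(lineno)
    return hits
```

Usage would look something like `find_strings(pdf_to_text("report.pdf"), ["Patient Face Sheet"])`, where `report.pdf` is a placeholder file name.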

Re: searching pdf files for certain info

2005-02-22 Thread rbt
Tom Willis wrote: I tried that for something not python related and I was getting sporadic spaces everywhere. I am assuming this is not the case in your experience? On Tue, 22 Feb 2005 10:45:09 -0500, rbt <[EMAIL PROTECTED]> wrote: Andreas Lobinger wrote: Aloha, rbt wrote: Thanks guys... what if I

Re: searching pdf files for certain info

2005-02-22 Thread Tom Willis
I tried that for something not python related and I was getting sporadic spaces everywhere. I am assuming this is not the case in your experience? On Tue, 22 Feb 2005 10:45:09 -0500, rbt <[EMAIL PROTECTED]> wrote: > Andreas Lobinger wrote: > > Aloha, > > > > rbt wrote: > > > >> Thanks guys... wh

Re: searching pdf files for certain info

2005-02-22 Thread rbt
Andreas Lobinger wrote: Aloha, rbt wrote: Thanks guys... what if I convert it to PS via printing it to a file or something? Would that make it easier to work with? Not really... The classical PS Drivers (f.e. Acroread4-Unix print-> ps) simply define the pdf graphics and text operators as PS comma

Re: searching pdf files for certain info

2005-02-22 Thread Andreas Lobinger
Aloha, rbt wrote: Thanks guys... what if I convert it to PS via printing it to a file or something? Would that make it easier to work with? Not really... The classical PS Drivers (f.e. Acroread4-Unix print-> ps) simply define the pdf graphics and text operators as PS commands and copy the pdf cont

Re: searching pdf files for certain info

2005-02-22 Thread rbt
Andreas Lobinger wrote: Aloha, rbt wrote: Not really a Python question... but here goes: Is there a way to read the content of a PDF file and decode it with Python? I'd like to read PDF's, decode them, and then search the data for certain strings. First of all, http://groups.google.de/groups?sel

Re: searching pdf files for certain info

2005-02-22 Thread Andreas Lobinger
Aloha, rbt wrote: Not really a Python question... but here goes: Is there a way to read the content of a PDF file and decode it with Python? I'd like to read PDF's, decode them, and then search the data for certain strings. First of all, http://groups.google.de/groups?selm=400CF2E3.29506EAE%40net

Re: searching pdf files for certain info

2005-02-22 Thread Diez B. Roggisch
rbt wrote: > Not really a Python question... but here goes: Is there a way to read > the content of a PDF file and decode it with Python? I'd like to read > PDF's, decode them, and then search the data for certain strings. There is a commercial tool, pdflib, available that might help. It has a fr

searching pdf files for certain info

2005-02-22 Thread rbt
Not really a Python question... but here goes: Is there a way to read the content of a PDF file and decode it with Python? I'd like to read PDF's, decode them, and then search the data for certain strings. Thanks, rbt