Re: pdf to text

2007-01-29 Thread Steve Holden
tubby wrote:
> Dieter Deyke wrote:
>>> sout = os.popen('pdftotext "%s" - ' %f)
>>
>> Your program above should read:
>>
>> sout = os.popen('pdftotext "%s" - ' % (f,))
>
> What is the significance of doing it this way?

It's actually just nit-picking - as long as you know f is never going to
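The reply above is cut off in this listing, so the following is only a sketch of the general Python behaviour the `(f,)` suggestion guards against, not a reconstruction of the rest of the message. The helper names `build_command` and `build_command_bare` are made up for illustration. With a plain string both forms give the same result; they differ only when `f` happens to be a tuple, because `%` then treats the tuple as the argument list.

```python
def build_command(f):
    # Wrapping f in a one-element tuple means "%" always sees exactly one
    # argument, whatever type f happens to be.
    return 'pdftotext "%s" - ' % (f,)


def build_command_bare(f):
    # The bare form lets "%" unpack f if f is itself a tuple.
    return 'pdftotext "%s" - ' % f


# For an ordinary filename string the two forms are identical.
print(build_command("report.pdf"))
print(build_command_bare("report.pdf"))

# A two-element tuple has one item too many for the single "%s",
# so the bare form raises TypeError.
pair = ("a.pdf", "b.pdf")
try:
    build_command_bare(pair)
except TypeError as exc:
    print("bare form failed:", exc)

# The wrapped form formats the tuple as a single value instead.
print(build_command(pair))
```

A one-element tuple is the quieter case: the bare form silently unpacks it, while the wrapped form prints the tuple's repr.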

Re: pdf to text

2007-01-29 Thread tubby
Dieter Deyke wrote:
>> sout = os.popen('pdftotext "%s" - ' %f)
>
> Your program above should read:
>
> sout = os.popen('pdftotext "%s" - ' % (f,))

What is the significance of doing it this way?

--
http://mail.python.org/mailman/listinfo/python-list

Re: pdf to text

2007-01-25 Thread Dieter Deyke
tubby writes:
> David Boddie wrote:
>
>> The pdftotext tool may do what you want:
>>
>> http://www.foolabs.com/xpdf/download.html
>>
>> Let us know how you get on with it.
>>
>> David
>
> Perhaps I'm just using pdftotext wrong? Here's how I was using it:
>
> f = filename
>
> try:
>     sout = os

Re: pdf to text

2007-01-25 Thread Lee Harr
> Perhaps I'm just using pdftotext wrong? Here's how I was using it:
>
> sout = os.popen('pdftotext "%s" - ' %f)

If you are having trouble with popen (not unlikely) how about just writing to a temporary file and reading the text from there?

I've used pdftotext several times in the past f
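The temporary-file idea could look roughly like the sketch below. It assumes Python 3 and a `pdftotext` binary on the PATH; the function name `pdf_to_text_via_tempfile` and its `run` parameter are invented here so the external call can be swapped out, and `-enc UTF-8` is passed so the output file can be decoded predictably. Treat it as one possible shape, not the code the poster used.

```python
import os
import subprocess
import tempfile


def pdf_to_text_via_tempfile(pdf_path, run=subprocess.run):
    """Have pdftotext write to a temporary file, then read that file back.

    `run` defaults to subprocess.run; it is a parameter only so the
    external call can be replaced (hypothetical convenience, e.g. for tests).
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        txt_path = os.path.join(tmpdir, "out.txt")
        # Arguments go in a list, so no shell quoting of pdf_path is needed.
        run(["pdftotext", "-enc", "UTF-8", pdf_path, txt_path], check=True)
        with open(txt_path, encoding="utf-8", errors="replace") as fh:
            return fh.read()
```

Calling `pdf_to_text_via_tempfile("some.pdf")` would return the extracted text as a string; the temporary directory and its contents are removed when the function returns.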

Re: pdf to text

2007-01-25 Thread tubby
David Boddie wrote:
> The pdftotext tool may do what you want:
>
> http://www.foolabs.com/xpdf/download.html
>
> Let us know how you get on with it.
>
> David

Perhaps I'm just using pdftotext wrong? Here's how I was using it:

f = filename

try:
    sout = os.popen('pdftotext "%s" - ' %f)
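For comparison, here is a sketch of the same "write to stdout" call using the `subprocess` module with an argument list instead of a formatted shell string, which avoids trouble with quote characters in filenames. It assumes Python 3 and `pdftotext` on the PATH; `pdftotext_stdout` and its `run` parameter are hypothetical names, and decoding the output as UTF-8 is an assumption about how the tool is configured.

```python
import subprocess


def pdftotext_stdout(pdf_path, run=subprocess.run):
    """Run pdftotext with "-" as the output file and return what it prints.

    `run` defaults to subprocess.run and is a parameter only so the
    external call can be replaced (hypothetical convenience).
    """
    # Each argument is its own list item, so the filename is passed
    # through unchanged even if it contains spaces or quote characters.
    result = run(["pdftotext", pdf_path, "-"],
                 stdout=subprocess.PIPE, check=True)
    # Assumption: output is UTF-8; undecodable bytes are replaced.
    return result.stdout.decode("utf-8", errors="replace")
```

With `check=True`, a non-zero exit status from `pdftotext` raises `subprocess.CalledProcessError` instead of quietly returning empty text.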

Re: pdf to text

2007-01-25 Thread tubby
David Boddie wrote:
> The pdftotext tool may do what you want:
>
> http://www.foolabs.com/xpdf/download.html
>
> Let us know how you get on with it.

I have used this tool. However, I need PDF read ability on Windows and Linux and in the future Macs. pdftotext works great on Linux, but poorly

Re: pdf to text

2007-01-25 Thread David Boddie
On Thursday 25 January 2007 22:05, tubby wrote:
> I know this question comes up a lot, so here goes again. I want to read
> text from a PDF file, run re searches on the text, etc. I do not care
> about layout, fonts, borders, etc. I just want the text. I've been
> reading Adobe's PDF Reference Gui

Re: pdf to text

2007-01-25 Thread Nils Oliver Kröger
Have a look at the pdflib (www.pdflib.com). Their Text Extraction Toolkit might be what you are looking for, though I'm not sure whether you can use it detached from the pdflib itself.

hth

Nils

tubby wrote:
> I know this question comes up a lot, s

pdf to text

2007-01-25 Thread tubby
I know this question comes up a lot, so here goes again. I want to read text from a PDF file, run re searches on the text, etc. I do not care about layout, fonts, borders, etc. I just want the text. I've been reading Adobe's PDF Reference Guide and I'm beginning to develop a better understandin

Re: PDF to text script

2006-11-10 Thread Nick Vatamaniuc
Vyz wrote:
> I am looking for a PDF to text script. I am working with multibyte
> language PDFs on Windows Xp. I need to batch convert them to text and
> feed into an encoding converter program
>
> Thanks for any help in this regard

Multibyte languages are not easy. I do text extr

Re: PDF to text script

2006-11-10 Thread Cameron Laird
In article <[EMAIL PROTECTED]>, Vyz <[EMAIL PROTECTED]> wrote:
>I am looking for a PDF to text script. I am working with multibyte
>language PDFs on Windows Xp. I need to batch convert them to text and
>feed into an encoding converter program
>
>Thanks for any

PDF to text script

2006-11-10 Thread Vyz
I am looking for a PDF to text script. I am working with multibyte language PDFs on Windows Xp. I need to batch convert them to text and feed into an encoding converter program

Thanks for any help in this regard