Re: Read and extract text from pdf

2006-04-21 Thread Jim
There is a pdftotext executable, at least on Linux.
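For what it's worth, here is a minimal sketch of calling that executable from Python. It assumes pdftotext (shipped with xpdf or poppler-utils) is on your PATH and that its output is UTF-8. The helper names build_pdftotext_command and extract_text are made up for illustration and are not part of any library.

```python
import shutil
import subprocess


def build_pdftotext_command(pdf_path, layout=False):
    """Build the argument list for a pdftotext call that writes to stdout.

    Hypothetical helper for illustration. "-" as the output file name
    tells pdftotext to write the text to stdout; "-layout" asks it to
    keep the physical layout of the page as far as it can.
    """
    cmd = ["pdftotext"]
    if layout:
        cmd.append("-layout")
    cmd += [pdf_path, "-"]
    return cmd


def extract_text(pdf_path, layout=False):
    """Run pdftotext on pdf_path and return the extracted text as a str."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        build_pdftotext_command(pdf_path, layout),
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    # Assumption: the output is UTF-8; undecodable bytes are replaced.
    return result.stdout.decode("utf-8", errors="replace")
```

Something like extract_text("some_file.pdf") should then hand back the text, where some_file.pdf is a placeholder name. A scanned PDF that contains only images will presumably come back empty or nearly so, since there is no text layer to extract.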

Re: Read and extract text from pdf

2006-04-21 Thread avishay
You can use Ghostscript for that purpose. Look at the ps2ascii script (or batch file) in the Ghostscript distribution. You can either call Ghostscript from the command line or use its DLL (I don't know whether a Python binding already exists...). The limitations the previous author has mentioned, however, still app

Re: Read and extract text from pdf

2006-04-21 Thread Rene Pijlman
Julien ARNOUX:
>I have a problem :), I just want to extract text from pdf file with
>python. There is differents libraries for that but it doesn't work...
>
>pyPdf and pdfTools, I don't know why but it doesn't works with some
>pdf...

Text can be represented in different ways in PDF: as tagged tex