Re: help on converting pdf to text

2005-11-03 Thread Octavian Rasnita
I have seen a program named pdftotext that can extract the text from .pdf files, and that program can be called from a perl program to extract the text from multiple .pdf files. Search Google for it. I have seen that it can extract the text even from some pdf files that have copy protection set,
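
A minimal sketch of the batch idea described above, assuming the pdftotext command-line tool (shipped with Xpdf and Poppler) is installed and on the PATH. It is written in Python purely for illustration; the poster is talking about calling the tool from perl, where the same pattern would be a loop around system(). The function names build_command and convert_all and the keep_layout parameter are hypothetical, not part of any library.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf_path, txt_path, keep_layout=False):
    """Return the argument list for a single pdftotext call."""
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to keep the original physical layout
        cmd.append("-layout")
    cmd += [str(pdf_path), str(txt_path)]
    return cmd


def convert_all(pdf_paths, keep_layout=False):
    """Convert each PDF to a .txt file beside it and return the output paths."""
    # Assumption: the external pdftotext binary is installed separately.
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on PATH")
    outputs = []
    for pdf in map(Path, pdf_paths):
        txt = pdf.with_suffix(".txt")
        # check=True raises CalledProcessError if pdftotext exits non-zero
        subprocess.run(build_command(pdf, txt, keep_layout), check=True)
        outputs.append(txt)
    return outputs
```

Calling convert_all(["a.pdf", "b.pdf"]) would, under these assumptions, write a.txt and b.txt next to the inputs. PDFs that contain only scanned images have no text layer for the tool to extract, so their output may be empty.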

RE: help on converting pdf to text

2005-11-03 Thread Wagner, David --- Senior Programmer Analyst --- WGO
Stephen York wrote:
> First off, realise that a pdf isn't just a marked up text document.
> It's a wrapper for images and text, movies and many other formats.
>
> If you have a text pdf, then the text is a postscript object
> catalogued somewhere within the pdf.
> I've never done this in perl, but

Re: help on converting pdf to text

2005-11-03 Thread Stephen York
First off, realise that a pdf isn't just a marked up text document. It's a wrapper for images and text, movies and many other formats. If you have a text pdf, then the text is a postscript object catalogued somewhere within the pdf. I've never done this in perl, but there are many commercial uti