I think you can extract the header fields (Subject, Creator, Author,
etc.) fairly easily. Extracting values from a table may not be so easy,
though: the order in which objects are created in the file depends on the
file's history, and in addition the PDF file may be in binary form.
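
To make the two points above concrete, here is a rough sketch in Python
(not PHP, and not a real PDF parser). It assumes the header fields sit in
an unencrypted, uncompressed Info dictionary as simple literal strings such
as /Author (Alain), and that any binary content streams use the common
Flate (zlib) compression. The helper names extract_info and inflate_streams
are made up for this illustration.

```python
import re
import zlib

# Keys from the PDF document information dictionary.
INFO_KEYS = ("Title", "Subject", "Author", "Creator", "Producer")


def extract_info(pdf_bytes):
    """Return Info-dictionary fields that are stored as plain literal strings."""
    fields = {}
    for key in INFO_KEYS:
        # Matches e.g. "/Author (Alain)". Assumption: the value contains no
        # escape sequences or nested parentheses, which real files may have.
        pattern = rb"/" + key.encode("ascii") + rb"\s*\(([^()\\]*)\)"
        match = re.search(pattern, pdf_bytes)
        if match:
            fields[key] = match.group(1).decode("latin-1")
    return fields


def inflate_streams(pdf_bytes):
    """Yield the decompressed body of each stream that zlib can inflate."""
    pattern = rb"stream\r?\n(.*?)\r?\nendstream"
    for match in re.finditer(pattern, pdf_bytes, re.DOTALL):
        try:
            yield zlib.decompress(match.group(1))
        except zlib.error:
            # Not Flate-compressed, encrypted, or damaged: skip it.
            continue


if __name__ == "__main__":
    # Tiny hand-made fragment, not a complete or valid PDF file.
    sample = (
        b"1 0 obj\n<< /Author (Alain) /Subject (Tables) >>\nendobj\n"
        b"2 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
        + zlib.compress(b"BT (cell 1) Tj ET")
        + b"\nendstream\nendobj\n"
    )
    print(extract_info(sample))
    print(list(inflate_streams(sample)))
```

Even after inflating a stream you only get text-drawing operators with page
positions, in whatever order the producing program wrote them. There are no
tags marking table cells, so rows and columns have to be guessed from the
coordinates.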
Alain
On Fri, Aug 31, 2001 at 12:16:38PM -0300, Paul Meagher wrote:
> Wondering if anyone has tried to parse out a table of information from a
> PDF file?
>
> Is it a matter of opening the file, looping through its contents
> line-by-line looking for tags that demarcate table cell boundaries and
> extracting the relevant cell values?
>
> I figured if Google can index PDF content it must be possible to pull the
> content out something like one would an HTML file.
>
> Mostly wondering how much work might be involved and if there are any
> tricks that I should be aware of before I begin...
>
> Regards,
> Paul
>
> --
> PHP Windows Mailing List (http://www.php.net/)
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> To contact the list administrators, e-mail: [EMAIL PROTECTED]