On 25 Mar 2001 20:25:47 -0800 impersonator of
[EMAIL PROTECTED] (David Robley) planted & I saw in
php.general:

>On Mon, 26 Mar 2001 13:20, Erick Papadakis wrote:
>> hi david,
>>
>> thanks for the note. ok, here is what i want to do. i
>> want my users to upload WORD, XLS, PPT and PDF files.
>> when they upload, i store these files in the temp
>> directory, grab the text from them, and then put it
>> into my database for later searching. i don't care
>> about the formatting, i only care about the text
>> because i need the keywords later for searching.
>>
>> can i run some sort of a parser on the server side
>> like the wvware.com's word parser and just call it
>> through php? i have not been able to figure out how to
>> do this using php.
>>
>> i would really appreciate any ideas and suggestions!
>>
>> thanks/erick
>
>OK - do you have the relevant parsers? There are specific (Unix) tools 
>available for PDF (pdftotext or Acrobat Reader) and Word; I mentioned 
>some of that in an earlier mail. For Excel you could probably just use 
>the Unix strings command to get the text. The same might work for 
>PowerPoint, but if some dipstick has created a graphics-only 
>presentation you won't get much that's useful.
>
>As I mentioned previously, you'll probably want to run the parser[s] from 
>PHP using one of the Program execution functions; exec, system or the 
>backtick operator should do what you want. How you capture the output 
>will of course depend on how the output is delivered - see the docs for 
>the particular parser.
>
>Probably none of the above is relevant if your server runs Windows.
>
Excuse me for jumping in, but I have just successfully run pdftotext (the
latest binary) from PHP's exec() on my personal Windows 95 machine. The only
inconvenience is that capturing the output directly does not seem to work
with this tool, so I have to put up with the output going to a file :-(
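
For what it's worth, here is a rough sketch of how the capture might be done
from PHP without a temporary file. It assumes your pdftotext build accepts
"-" as the output file name and then writes the text to standard output
(xpdf's pdftotext documents this, but I have not checked every version or
the Windows 95 binary). The function names and $uploadedPdf are made up for
illustration only.

```php
<?php
// Build the shell command. The trailing "-" asks pdftotext to write the
// extracted text to standard output instead of a file (assumption: your
// pdftotext version supports this; check its documentation).
function build_pdftotext_command($pdfPath)
{
    // escapeshellarg() quotes the path so odd file names cannot break
    // the command line.
    return 'pdftotext ' . escapeshellarg($pdfPath) . ' -';
}

// Run a command and return its captured stdout as one string,
// or false if the command exits with a non-zero status.
function run_and_capture($command)
{
    $lines = array();
    $status = 0;
    // exec() appends each output line to $lines and sets $status
    // to the exit code of the command.
    exec($command, $lines, $status);
    if ($status !== 0) {
        return false;
    }
    return implode("\n", $lines);
}

// Hypothetical usage; $uploadedPdf is an assumed path to the uploaded file.
// $text = run_and_capture(build_pdftotext_command($uploadedPdf));
```

If the "-" form does not work with a particular binary, writing to a file
and reading it back, as described above, remains the fallback.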

Regards,
--
LeoN     to  e-mail: cut  "auto_no." if present. 
(.±.)  ` to think - is to speak quietly,  to speak - is to think aloud`
 \~/     
My posted articles archive: http://leo.portland.co.uk/doc00.htm


-- 
PHP General Mailing List (http://www.php.net/)