spinergywmy wrote:
I have posted this question before, and this time I found that it could be a
PDFBox problem; the PDFBox I downloaded doesn't use log4j.jar. Indexing the
approx. 2.13 MB PDF file took 17 s, and the total time to upload a file is
18 s.
Re: PDFBox.
I have a 2.5 MB test file that extracts in ~4 seconds. The CPU is an AMD Athlon
3500+ at 2200 MHz. While converting, it creates a 4.7 MB file in the temp
directory (pdfboxnnnnn.tmp).
Another 3.7 MB test file takes over 20 seconds to extract and creates a 15 MB
temp file.
So extraction time seems to depend on the data rather than just the file size,
and Cygwin's pdftotext shows a similar performance profile over the two
documents. I would take this to the PDFBox forum to continue the discussion.
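
To make the comparison between the two files easier to read, here is a minimal Java sketch that turns the figures above into throughput and temp-file expansion. The class and method names (ExtractionStats, throughputMbPerSec, tempExpansion, timeSeconds) are made up for illustration and are not part of PDFBox. The 4 s and 20 s inputs are the approximate timings quoted above, and the extraction call itself is left as a placeholder Runnable because the PDFBox API differs between versions.

```java
class ExtractionStats {
    /** Megabytes of input processed per second of wall-clock time. */
    static double throughputMbPerSec(double sizeMb, double seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("seconds must be positive");
        }
        return sizeMb / seconds;
    }

    /** How many times larger the temp file is than the source PDF. */
    static double tempExpansion(double tempMb, double sourceMb) {
        if (sourceMb <= 0) {
            throw new IllegalArgumentException("sourceMb must be positive");
        }
        return tempMb / sourceMb;
    }

    /** Wall-clock seconds taken by an arbitrary task, e.g. one extraction run. */
    static double timeSeconds(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1e9;
    }

    public static void main(String[] args) {
        // Sizes and timings as reported in this message (timings are approximate).
        System.out.printf("2.5 MB file: %.2f MB/s, temp file x%.2f%n",
                throughputMbPerSec(2.5, 4.0), tempExpansion(4.7, 2.5));
        System.out.printf("3.7 MB file: %.2f MB/s, temp file x%.2f%n",
                throughputMbPerSec(3.7, 20.0), tempExpansion(15.0, 3.7));

        // Placeholder: replace the empty body with the real extraction call.
        double elapsed = timeSeconds(() -> { /* extract text here */ });
        System.out.printf("placeholder task took %.6f s%n", elapsed);
    }
}
```

On these figures the first file extracts at roughly 0.6 MB/s with a temp file a little under twice its size, while the second manages under 0.2 MB/s with a temp file about four times its size, which fits the idea that the content, not the file size, drives the cost.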
Antony