http://lucene.apache.org/java/docs/contributions.html lists several
PDF alternatives, but I can't speak to their performance. I am sure
if you googled PDF converters you could find a fair number of hits.
Perhaps w/ some more details about your app we might be able to find
a workaround. We often convert PDFs as a one time offline task
(which doesn't get around the fact that you need to do it) so that
they don't get in the way when indexing (and reindexing), but we
generally deal w/ fixed collections, which may not be your case.
-Grant
On Nov 30, 2006, at 5:48 AM, spinergywmy wrote:
Hi guys,
I have posted this question before and this time I found that it
could be
pdfbox problem and this pdfbox I downloaded doesn't use the
log4j.jar. To
index the app 2.13mb pdf file took me 17s and total time to upload
a file is
18s.
So, is there any way or others software than pdfbox to solve the
performance issue.
Thanks.
regards,
Wooi Meng
--
View this message in context: http://www.nabble.com/indexing-
performance-issue-tf2730895.html#a7617155
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org
Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/
LuceneFAQ
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]