http://lucene.apache.org/java/docs/contributions.html lists several PDF alternatives, but I can't speak to their performance. I am sure if you googled PDF converters you could find a fair number of hits.

Perhaps w/ some more details about your app we might be able to find a workaround. We often convert PDFs as a one time offline task (which doesn't get around the fact that you need to do it) so that they don't get in the way when indexing (and reindexing), but we generally deal w/ fixed collections, which may not be your case.

-Grant

On Nov 30, 2006, at 5:48 AM, spinergywmy wrote:


Hi guys,

I have posted this question before and this time I found that it could be pdfbox problem and this pdfbox I downloaded doesn't use the log4j.jar. To index the app 2.13mb pdf file took me 17s and total time to upload a file is
18s.

   So, is there any way or others software than pdfbox to solve the
performance issue.

   Thanks.


regards,
Wooi Meng
--
View this message in context: http://www.nabble.com/indexing- performance-issue-tf2730895.html#a7617155
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org

Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/ LuceneFAQ



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to